49c25af89c6ad02b60a1a2bcbe16ecef330782cf
389 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
d70c95bd81 |
Merge 5.10.180 into android12-5.10-lts
Changes in 5.10.180 seccomp: Move copy_seccomp() to no failure path. counter: 104-quad-8: Fix race condition between FLAG and CNTR reads KVM: arm64: Fix buffer overflow in kvm_arm_set_fw_reg() wifi: brcmfmac: slab-out-of-bounds read in brcmf_get_assoc_ies() drm/fb-helper: set x/yres_virtual in drm_fb_helper_check_var bluetooth: Perform careful capability checks in hci_sock_ioctl() x86/fpu: Prevent FPU state corruption USB: serial: option: add UNISOC vendor and TOZED LT70C product driver core: Don't require dynamic_debug for initcall_debug probe timing iio: adc: palmas_gpadc: fix NULL dereference on rmmod ASoC: Intel: bytcr_rt5640: Add quirk for the Acer Iconia One 7 B1-750 asm-generic/io.h: suppress endianness warnings for readq() and writeq() wireguard: timers: cast enum limits members to int in prints PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock PCI: qcom: Fix the incorrect register usage in v2.7.0 config USB: dwc3: fix runtime pm imbalance on probe errors USB: dwc3: fix runtime pm imbalance on unbind hwmon: (k10temp) Check range scale when CUR_TEMP register is read-write hwmon: (adt7475) Use device_property APIs when configuring polarity posix-cpu-timers: Implement the missing timer_wait_running callback perf sched: Cast PTHREAD_STACK_MIN to int as it may turn into sysconf(__SC_THREAD_STACK_MIN_VALUE) blk-mq: release crypto keyslot before reporting I/O complete blk-crypto: make blk_crypto_evict_key() return void blk-crypto: make blk_crypto_evict_key() more robust ext4: use ext4_journal_start/stop for fast commit transactions staging: iio: resolver: ads1210: fix config mode xhci: fix debugfs register accesses while suspended tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem MIPS: fw: Allow firmware to pass a empty env ipmi:ssif: Add send_retries increment ipmi: fix SSIF not responding under certain cond. kheaders: Use array declaration instead of char pwm: meson: Fix axg ao mux parents pwm: meson: Fix g12a ao clk81 name ring-buffer: Sync IRQ works before buffer destruction crypto: api - Demote BUG_ON() in crypto_unregister_alg() to a WARN_ON() crypto: safexcel - Cleanup ring IRQ workqueues on load failure rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed reiserfs: Add security prefix to xattr name in reiserfs_security_write() KVM: nVMX: Emulate NOPs in L2, and PAUSE if it's not intercepted relayfs: fix out-of-bounds access in relay_file_read writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs i2c: omap: Fix standard mode false ACK readings iommu/amd: Fix "Guest Virtual APIC Table Root Pointer" configuration in IRTE Revert "ubifs: dirty_cow_znode: Fix memleak in error handling path" ubifs: Fix memleak when insert_old_idx() failed ubi: Fix return value overwrite issue in try_write_vid_and_data() ubifs: Free memory for tmpfile name sound/oss/dmasound: fix build when drivers are mixed =y/=m parisc: Fix argument pointer in real64_call_asm() nilfs2: do not write dirty data after degenerating to read-only nilfs2: fix infinite loop in nilfs_mdt_get_block() md/raid10: fix null-ptr-deref in raid10_sync_request mailbox: zynqmp: Fix IPI isr handling mailbox: zynqmp: Fix typo in IPI documentation wifi: rtl8xxxu: RTL8192EU always needs full init clk: rockchip: rk3399: allow clk_cifout to force clk_cifout_src to reparent rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check selftests/resctrl: Return NULL if malloc_and_init_memory() did not alloc mem selftests/resctrl: Check for return value after write_schemata() selinux: fix Makefile dependencies of flask.h selinux: ensure av_permissions.h is built when needed tpm, tpm_tis: Do not skip reset of original interrupt vector tpm, tpm_tis: Claim locality before writing TPM_INT_ENABLE register tpm, tpm_tis: Disable interrupts if tpm_tis_probe_irq() failed tpm, tpm_tis: Claim locality before writing interrupt registers tpm, tpm: Implement usage counter for locality tpm, tpm_tis: Claim locality when interrupts are reenabled on resume erofs: stop parsing non-compact HEAD index if clusterofs is invalid erofs: fix potential overflow calculating xattr_isize drm/rockchip: Drop unbalanced obj unref drm/vgem: add missing mutex_destroy drm/probe-helper: Cancel previous job before starting new one soc: ti: pm33xx: Enable basic PM runtime support for genpd soc: ti: pm33xx: Fix refcount leak in am33xx_pm_probe arm64: dts: renesas: r8a77990: Remove bogus voltages from OPP table arm64: dts: renesas: r8a774c0: Remove bogus voltages from OPP table drm/msm/disp/dpu: check for crtc enable rather than crtc active to release shared resources EDAC/skx: Fix overflows on the DRAM row address mapping arrays arm64: dts: qcom: msm8998: Fix stm-stimulus-base reg name arm64: dts: qcom: sdm845: correct dynamic power coefficients arm64: dts: qcom: sdm845: Fix the PCI I/O port range arm64: dts: qcom: msm8998: Fix the PCI I/O port range arm64: dts: qcom: ipq8074: Fix the PCI I/O port range arm64: dts: qcom: msm8996: Fix the PCI I/O port range ARM: dts: qcom: ipq4019: Fix the PCI I/O port range ARM: dts: qcom: ipq8064: reduce pci IO size to 64K ARM: dts: qcom: ipq8064: Fix the PCI I/O port range x86/MCE/AMD: Use an u64 for bank_map media: bdisp: Add missing check for create_workqueue firmware: qcom_scm: Clear download bit during reboot drm/bridge: adv7533: Fix adv7533_mode_valid for adv7533 and adv7535 media: max9286: Free control handler drm/msm/adreno: Defer enabling runpm until hw_init() drm/msm/adreno: drop bogus pm_runtime_set_active() drm: msm: adreno: Disable preemption on Adreno 510 ACPI: processor: Fix evaluating _PDC method when running as Xen dom0 mmc: sdhci-of-esdhc: fix quirk to ignore command inhibit for data ARM: dts: gta04: fix excess dma channel usage drm/lima/lima_drv: Add missing unwind goto in lima_pdev_probe() regulator: core: Consistently set mutex_owner when using ww_mutex_lock_slow() regulator: core: Avoid lockdep reports when resolving supplies x86/apic: Fix atomic update of offset in reserve_eilvt_offset() media: rkvdec: fix use after free bug in rkvdec_remove media: dm1105: Fix use after free bug in dm1105_remove due to race condition media: saa7134: fix use after free bug in saa7134_finidev due to race condition media: rcar_fdp1: simplify error check logic at fdp_open() media: rcar_fdp1: fix pm_runtime_get_sync() usage count media: rcar_fdp1: Make use of the helper function devm_platform_ioremap_resource() media: rcar_fdp1: Fix the correct variable assignments media: rcar_fdp1: Fix refcount leak in probe and remove function media: rc: gpio-ir-recv: Fix support for wake-up media: venus: vdec: Fix non reliable setting of LAST flag media: venus: vdec: Make decoder return LAST flag for sufficient event media: venus: preserve DRC state across seeks media: venus: vdec: Handle DRC after drain media: venus: dec: Fix handling of the start cmd regulator: stm32-pwr: fix of_iomap leak x86/ioapic: Don't return 0 from arch_dynirq_lower_bound() arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step debugobject: Prevent init race with static objects drm/i915: Make intel_get_crtc_new_encoder() less oopsy tick/sched: Use tick_next_period for lockless quick check tick/sched: Reduce seqcount held scope in tick_do_update_jiffies64() tick/sched: Optimize tick_do_update_jiffies64() further tick: Get rid of tick_period tick/common: Align tick period with the HZ tick. wifi: ath6kl: minor fix for allocation size wifi: ath9k: hif_usb: fix memory leak of remain_skbs wifi: ath5k: fix an off by one check in ath5k_eeprom_read_freq_list() wifi: ath6kl: reduce WARN to dev_dbg() in callback tools: bpftool: Remove invalid \' json escape wifi: rtw88: mac: Return the original error from rtw_pwr_seq_parser() wifi: rtw88: mac: Return the original error from rtw_mac_power_switch() bpf: take into account liveness when propagating precision bpf: fix precision propagation verbose logging scm: fix MSG_CTRUNC setting condition for SO_PASSSEC bpf: Remove misleading spec_v1 check on var-offset stack read vlan: partially enable SIOCSHWTSTAMP in container net/packet: annotate accesses to po->xmit net/packet: convert po->origdev to an atomic flag net/packet: convert po->auxdata to an atomic flag scsi: target: Rename struct sense_info to sense_detail scsi: target: Rename cmd.bad_sector to cmd.sense_info scsi: target: Make state_list per CPU scsi: target: Fix multiple LUN_RESET handling scsi: target: iscsit: Fix TAS handling during conn cleanup scsi: megaraid: Fix mega_cmd_done() CMDID_INT_CMDS f2fs: handle dqget error in f2fs_transfer_project_quota() f2fs: enforce single zone capacity f2fs: apply zone capacity to all zone type f2fs: compress: fix to call f2fs_wait_on_page_writeback() in f2fs_write_raw_pages() crypto: caam - Clear some memory in instantiate_rng crypto: sa2ul - Select CRYPTO_DES wifi: rtlwifi: fix incorrect error codes in rtl_debugfs_set_write_rfreg() wifi: rtlwifi: fix incorrect error codes in rtl_debugfs_set_write_reg() net: qrtr: correct types of trace event parameters selftests/bpf: Wait for receive in cg_storage_multi test bpftool: Fix bug for long instructions in program CFG dumps crypto: drbg - make drbg_prepare_hrng() handle jent instantiation errors crypto: drbg - Only fail when jent is unavailable in FIPS mode xsk: Fix unaligned descriptor validation f2fs: fix to avoid use-after-free for cached IPU bio scsi: lpfc: Fix ioremap issues in lpfc_sli4_pci_mem_setup() net: ethernet: stmmac: dwmac-rk: fix optional phy regulator handling bpf, sockmap: fix deadlocks in the sockhash and sockmap nvme: handle the persistent internal error AER nvme: fix async event trace event nvme-fcloop: fix "inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage" bpf, sockmap: Revert buggy deadlock fix in the sockhash and sockmap md/raid10: fix leak of 'r10bio->remaining' for recovery md/raid10: fix memleak for 'conf->bio_split' md/raid10: fix memleak of md thread wifi: iwlwifi: yoyo: Fix possible division by zero wifi: iwlwifi: fw: move memset before early return jdb2: Don't refuse invalidation of already invalidated buffers wifi: iwlwifi: make the loop for card preparation effective wifi: iwlwifi: mvm: check firmware response size wifi: iwlwifi: fw: fix memory leak in debugfs ixgbe: Allow flow hash to be set via ethtool ixgbe: Enable setting RSS table to default values bpf: Don't EFAULT for getsockopt with optval=NULL netfilter: nf_tables: don't write table validation state without mutex net/sched: sch_fq: fix integer overflow of "credit" ipv4: Fix potential uninit variable access bug in __ip_make_skb() Revert "Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work" netlink: Use copy_to_user() for optval in netlink_getsockopt(). net: amd: Fix link leak when verifying config failed tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp. ipmi: ASPEED_BT_IPMI_BMC: select REGMAP_MMIO instead of depending on it pstore: Revert pmsg_lock back to a normal mutex usb: host: xhci-rcar: remove leftover quirk handling usb: dwc3: gadget: Change condition for processing suspend event fpga: bridge: fix kernel-doc parameter description iio: light: max44009: add missing OF device matching spi: spi-imx: using pm_runtime_resume_and_get instead of pm_runtime_get_sync spi: imx: Don't skip cleanup in remove's error path usb: gadget: udc: renesas_usb3: Fix use after free bug in renesas_usb3_remove due to race condition PCI: imx6: Install the fault handler only on compatible match ASoC: es8316: Use IRQF_NO_AUTOEN when requesting the IRQ ASoC: es8316: Handle optional IRQ assignment linux/vt_buffer.h: allow either builtin or modular for macros spi: qup: Don't skip cleanup in remove's error path spi: fsl-spi: Fix CPM/QE mode Litte Endian vmci_host: fix a race condition in vmci_host_poll() causing GPF of: Fix modalias string generation PCI/EDR: Clear Device Status after EDR error recovery ia64: mm/contig: fix section mismatch warning/error ia64: salinfo: placate defined-but-not-used warning scripts/gdb: bail early if there are no clocks scripts/gdb: bail early if there are no generic PD coresight: etm_pmu: Set the module field ASoC: fsl_mqs: move of_node_put() to the correct location spi: cadence-quadspi: fix suspend-resume implementations i2c: cadence: cdns_i2c_master_xfer(): Fix runtime PM leak on error path uapi/linux/const.h: prefer ISO-friendly __typeof__ sh: sq: Fix incorrect element size for allocating bitmap buffer usb: gadget: tegra-xudc: Fix crash in vbus_draw usb: chipidea: fix missing goto in `ci_hdrc_probe` usb: mtu3: fix kernel panic at qmu transfer done irq handler firmware: stratix10-svc: Fix an NULL vs IS_ERR() bug in probe tty: serial: fsl_lpuart: adjust buffer length to the intended size serial: 8250: Add missing wakeup event reporting staging: rtl8192e: Fix W_DISABLE# does not work after stop/start spmi: Add a check for remove callback when removing a SPMI driver macintosh/windfarm_smu_sat: Add missing of_node_put() powerpc/mpc512x: fix resource printk format warning powerpc/wii: fix resource printk format warnings powerpc/sysdev/tsi108: fix resource printk format warnings macintosh: via-pmu-led: requires ATA to be set powerpc/rtas: use memmove for potentially overlapping buffer copy perf/core: Fix hardlockup failure caused by perf throttle clk: at91: clk-sam9x60-pll: fix return value check RDMA/siw: Fix potential page_array out of range access RDMA/rdmavt: Delete unnecessary NULL check workqueue: Rename "delayed" (delayed by active management) to "inactive" workqueue: Fix hung time report of worker pools rtc: omap: include header for omap_rtc_power_off_program prototype RDMA/mlx4: Prevent shift wrapping in set_user_sq_size() rtc: meson-vrtc: Use ktime_get_real_ts64() to get the current time power: supply: generic-adc-battery: fix unit scaling clk: add missing of_node_put() in "assigned-clocks" property parsing RDMA/siw: Remove namespace check from siw_netdev_event() RDMA/cm: Trace icm_send_rej event before the cm state is reset RDMA/srpt: Add a check for valid 'mad_agent' pointer IB/hfi1: Fix SDMA mmu_rb_node not being evicted in LRU order IB/hfi1: Add AIP tx traces IB/hfi1: Add additional usdma traces IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests NFSv4.1: Always send a RECLAIM_COMPLETE after establishing lease firmware: raspberrypi: Introduce devm_rpi_firmware_get() input: raspberrypi-ts: Release firmware handle when not needed Input: raspberrypi-ts - fix refcount leak in rpi_ts_probe RDMA/mlx5: Fix flow counter query via DEVX SUNRPC: remove the maximum number of retries in call_bind_status RDMA/mlx5: Use correct device num_ports when modify DC clocksource/drivers/davinci: Fix memory leak in davinci_timer_register when init fails openrisc: Properly store r31 to pt_regs on unhandled exceptions ext4: fix use-after-free read in ext4_find_extent for bigalloc + inline leds: TI_LMU_COMMON: select REGMAP instead of depending on it dmaengine: mv_xor_v2: Fix an error code. leds: tca6507: Fix error handling of using fwnode_property_read_string pwm: mtk-disp: Don't check the return code of pwmchip_remove() pwm: mtk-disp: Adjust the clocks to avoid them mismatch pwm: mtk-disp: Disable shadow registers before setting backlight values phy: tegra: xusb: Add missing tegra_xusb_port_unregister for usb2_port and ulpi_port dmaengine: dw-edma: Fix to change for continuous transfer dmaengine: dw-edma: Fix to enable to issue dma request on DMA processing dmaengine: at_xdmac: do not enable all cyclic channels thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe mfd: tqmx86: Do not access I2C_DETECT register through io_base mfd: tqmx86: Remove incorrect TQMx90UC board ID mfd: tqmx86: Add support for TQMx110EB and TQMxE40x mfd: tqmx86: Specify IO port register range more precisely mfd: tqmx86: Correct board names for TQMxE39x afs: Fix updating of i_size with dv jump from server scripts/gdb: fix lx-timerlist for Python3 btrfs: scrub: reject unsupported scrub flags s390/dasd: fix hanging blockdevice after request requeue ia64: fix an addr to taddr in huge_pte_offset() dm clone: call kmem_cache_destroy() in dm_clone_init() error path dm integrity: call kmem_cache_destroy() in dm_integrity_init() error path dm flakey: fix a crash with invalid table line dm ioctl: fix nested locking in table_clear() to remove deadlock concern perf auxtrace: Fix address filter entire kernel size perf intel-pt: Fix CYC timestamps after standalone CBR arm64: Always load shadow stack pointer directly from the task struct arm64: Stash shadow stack pointer in the task struct on interrupt debugobject: Ensure pool refill (again) sound/oss/dmasound: fix 'dmasound_setup' defined but not used arm64: dts: qcom: sdm845: correct dynamic power coefficients scsi: target: core: Avoid smp_processor_id() in preemptible code netfilter: nf_tables: deactivate anonymous set from preparation phase tty: create internal tty.h file tty: audit: move some local functions out of tty.h tty: move some internal tty lock enums and functions out of tty.h tty: move some tty-only functions to drivers/tty/tty.h tty: clean include/linux/tty.h up tty: Prevent writing chars during tcsetattr TCSADRAIN/FLUSH ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus crypto: ccp - Clear PSP interrupt status register before calling handler mailbox: zynq: Switch to flexible array to simplify code mailbox: zynqmp: Fix counts of child nodes dm verity: skip redundant verity_handle_err() on I/O errors dm verity: fix error handling for check_at_most_once on FEC scsi: qedi: Fix use after free bug in qedi_remove() net/ncsi: clear Tx enable mode when handling a Config required AEN net/sched: cls_api: remove block_cb from driver_list before freeing sit: update dev->needed_headroom in ipip6_tunnel_bind_dev() net: dsa: mv88e6xxx: add mv88e6321 rsvd2cpu writeback: fix call of incorrect macro watchdog: dw_wdt: Fix the error handling path of dw_wdt_drv_probe() net/sched: act_mirred: Add carrier check sfc: Fix module EEPROM reporting for QSFP modules rxrpc: Fix hard call timeout units octeontx2-pf: Disable packet I/O for graceful exit octeontx2-vf: Detach LF resources on probe cleanup ionic: remove noise from ethtool rxnfc error msg af_packet: Don't send zero-byte data in packet_sendmsg_spkt(). drm/amdgpu: add a missing lock for AMDGPU_SCHED ALSA: caiaq: input: Add error handling for unsupported input methods in `snd_usb_caiaq_input_init` net: dsa: mt7530: fix corrupt frames using trgmii on 40 MHz XTAL MT7621 virtio_net: split free_unused_bufs() virtio_net: suppress cpu stall when free_unused_bufs net: enetc: check the index of the SFI rather than the handle perf vendor events power9: Remove UTF-8 characters from JSON files perf pmu: zfree() expects a pointer to a pointer to zero it after freeing its contents perf map: Delete two variable initialisations before null pointer checks in sort__sym_from_cmp() crypto: sun8i-ss - Fix a test in sun8i_ss_setup_ivs() perf symbols: Fix return incorrect build_id size in elf_read_build_id() btrfs: fix btrfs_prev_leaf() to not return the same key twice btrfs: don't free qgroup space unless specified btrfs: print-tree: parent bytenr must be aligned to sector size cifs: fix pcchunk length type in smb2_copychunk_range platform/x86: touchscreen_dmi: Add upside-down quirk for GDIX1002 ts on the Juno Tablet platform/x86: touchscreen_dmi: Add info for the Dexp Ursus KX210i inotify: Avoid reporting event with invalid wd sh: math-emu: fix macro redefined warning sh: mcount.S: fix build error when PRINTK is not enabled sh: init: use OF_EARLY_FLATTREE for early init sh: nmi_debug: fix return value of __setup handler remoteproc: stm32: Call of_node_put() on iteration error remoteproc: st: Call of_node_put() on iteration error ARM: dts: exynos: fix WM8960 clock name in Itop Elite ARM: dts: s5pv210: correct MIPI CSIS clock name f2fs: fix potential corruption when moving a directory drm/panel: otm8009a: Set backlight parent to panel device drm/amdgpu: fix an amdgpu_irq_put() issue in gmc_v9_0_hw_fini() drm/amdgpu/gfx: disable gfx9 cp_ecc_error_irq only when enabling legacy gfx ras drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspend HID: wacom: Set a default resolution for older tablets HID: wacom: insert timestamp to packed Bluetooth (BT) events KVM: x86: hyper-v: Avoid calling kvm_make_vcpus_request_mask() with vcpu_mask==NULL KVM: x86: do not report a vCPU as preempted outside instruction boundaries ext4: fix WARNING in mb_find_extent ext4: avoid a potential slab-out-of-bounds in ext4_group_desc_csum ext4: fix data races when using cached status extents ext4: check iomap type only if ext4_iomap_begin() does not fail ext4: improve error recovery code paths in __ext4_remount() ext4: fix deadlock when converting an inline directory in nojournal mode ext4: add bounds checking in get_max_inline_xattr_value_size() ext4: bail out of ext4_xattr_ibody_get() fails for any reason ext4: remove a BUG_ON in ext4_mb_release_group_pa() ext4: fix invalid free tracking in ext4_xattr_move_to_block() serial: 8250: Fix serial8250_tx_empty() race with DMA Tx drbd: correctly submit flush bio on barrier KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior KVM: x86: Fix recording of guest steal time / preempted status KVM: Fix steal time asm constraints KVM: x86: Remove obsolete disabling of page faults in kvm_arch_vcpu_put() KVM: x86: do not set st->preempted when going back to user space KVM: x86: revalidate steal time cache if MSR value changes KVM: x86: do not report preemption if the steal time cache is stale KVM: x86: move guest_pv_has out of user_access section printk: declare printk_deferred_{enter,safe}() in include/linux/printk.h drm/exynos: move to use request_irq by IRQF_NO_AUTOEN flag mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock drm/amd/display: Fix hang when skipping modeset Linux 5.10.180 Change-Id: Ie0c8ae79d56d844ec23ec277d91d4c70c3e1e9a8 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
![]() |
6f849f24da |
Merge e0dd13b49d ("wifi: rtl8xxxu: RTL8192EU always needs full init") into android12-5.10-lts
Steps on the way to 5.10.180 Change-Id: Id1ae1d6b019603d17be21ebc68f399eb60bde38a Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
![]() |
3b7798b42e |
writeback: fix call of incorrect macro
[ Upstream commit 3e46c89c74f2c38e5337d2cf44b0b551adff1cb4 ]
the variable 'history' is of type u16, it may be an error
that the hweight32 macro was used for it
I guess macro hweight16 should be used
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes:
|
||
![]() |
2b00b2a0e6 |
writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs
[ Upstream commit 1ba1199ec5747f475538c0d25a32804e5ba1dfde ] KASAN report null-ptr-deref: ================================================================== BUG: KASAN: null-ptr-deref in bdi_split_work_to_wbs+0x5c5/0x7b0 Write of size 8 at addr 0000000000000000 by task sync/943 CPU: 5 PID: 943 Comm: sync Tainted: 6.3.0-rc5-next-20230406-dirty #461 Call Trace: <TASK> dump_stack_lvl+0x7f/0xc0 print_report+0x2ba/0x340 kasan_report+0xc4/0x120 kasan_check_range+0x1b7/0x2e0 __kasan_check_write+0x24/0x40 bdi_split_work_to_wbs+0x5c5/0x7b0 sync_inodes_sb+0x195/0x630 sync_inodes_one_sb+0x3a/0x50 iterate_supers+0x106/0x1b0 ksys_sync+0x98/0x160 [...] ================================================================== The race that causes the above issue is as follows: cpu1 cpu2 -------------------------|------------------------- inode_switch_wbs INIT_WORK(&isw->work, inode_switch_wbs_work_fn) queue_rcu_work(isw_wq, &isw->work) // queue_work async inode_switch_wbs_work_fn wb_put_many(old_wb, nr_switched) percpu_ref_put_many ref->data->release(ref) cgwb_release queue_work(cgwb_release_wq, &wb->release_work) // queue_work async &wb->release_work cgwb_release_workfn ksys_sync iterate_supers sync_inodes_one_sb sync_inodes_sb bdi_split_work_to_wbs kmalloc(sizeof(*work), GFP_ATOMIC) // alloc memory failed percpu_ref_exit ref->data = NULL kfree(data) wb_get(wb) percpu_ref_get(&wb->refcnt) percpu_ref_get_many(ref, 1) atomic_long_add(nr, &ref->data->count) atomic64_add(i, v) // trigger null-ptr-deref bdi_split_work_to_wbs() traverses &bdi->wb_list to split work into all wbs. If the allocation of new work fails, the on-stack fallback will be used and the reference count of the current wb is increased afterwards. If cgroup writeback membership switches occur before getting the reference count and the current wb is released as old_wd, then calling wb_get() or wb_put() will trigger the null pointer dereference above. This issue was introduced in v4.3-rc7 (see fix tag1). Both sync_inodes_sb() and __writeback_inodes_sb_nr() calls to bdi_split_work_to_wbs() can trigger this issue. For scenarios called via sync_inodes_sb(), originally commit |
||
![]() |
0e8e989142 |
Merge 5.10.121 into android12-5.10-lts
Changes in 5.10.121 binfmt_flat: do not stop relocating GOT entries prematurely on riscv parisc/stifb: Implement fb_is_primary_device() riscv: Initialize thread pointer before calling C functions riscv: Fix irq_work when SMP is disabled ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15 9520 laptop ALSA: hda/realtek - Fix microphone noise on ASUS TUF B550M-PLUS ALSA: usb-audio: Cancel pending work at closing a MIDI substream USB: serial: option: add Quectel BG95 modem USB: new quirk for Dell Gen 2 devices usb: dwc3: gadget: Move null pinter check to proper place usb: core: hcd: Add support for deferring roothub registration cifs: when extending a file with falloc we should make files not-sparse xhci: Allow host runtime PM as default for Intel Alder Lake N xHCI Fonts: Make font size unsigned in font_desc parisc/stifb: Keep track of hardware path of graphics card x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails perf/x86/intel: Fix event constraints for ICL ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP ptrace: Reimplement PTRACE_KILL by always sending SIGKILL btrfs: add "0x" prefix for unsupported optional features btrfs: repair super block num_devices automatically iommu/vt-d: Add RPLS to quirk list to skip TE disabling drm/virtio: fix NULL pointer dereference in virtio_gpu_conn_get_modes mwifiex: add mutex lock for call in mwifiex_dfs_chan_sw_work_queue b43legacy: Fix assigning negative value to unsigned variable b43: Fix assigning negative value to unsigned variable ipw2x00: Fix potential NULL dereference in libipw_xmit() ipv6: fix locking issues with loops over idev->addr_list fbcon: Consistently protect deferred_takeover with console_lock() x86/platform/uv: Update TSC sync state for UV5 ACPICA: Avoid cache flush inside virtual machines drm/komeda: return early if drm_universal_plane_init() fails. rcu-tasks: Fix race in schedule and flush work rcu: Make TASKS_RUDE_RCU select IRQ_WORK sfc: ef10: Fix assigning negative value to unsigned variable ALSA: jack: Access input_dev under mutex spi: spi-rspi: Remove setting {src,dst}_{addr,addr_width} based on DMA direction tools/power turbostat: fix ICX DRAM power numbers drm/amd/pm: fix double free in si_parse_power_table() ath9k: fix QCA9561 PA bias level media: venus: hfi: avoid null dereference in deinit media: pci: cx23885: Fix the error handling in cx23885_initdev() media: cx25821: Fix the warning when removing the module md/bitmap: don't set sb values if can't pass sanity check mmc: jz4740: Apply DMA engine limits to maximum segment size drivers: mmc: sdhci_am654: Add the quirk to set TESTCD bit scsi: megaraid: Fix error check return value of register_chrdev() scsi: ufs: Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync() scsi: lpfc: Fix resource leak in lpfc_sli4_send_seq_to_ulp() ath11k: disable spectral scan during spectral deinit ASoC: Intel: bytcr_rt5640: Add quirk for the HP Pro Tablet 408 drm/plane: Move range check for format_count earlier drm/amd/pm: fix the compile warning ath10k: skip ath10k_halt during suspend for driver state RESTARTING arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall drm: msm: fix error check return value of irq_of_parse_and_map() ipv6: Don't send rs packets to the interface of ARPHRD_TUNNEL net/mlx5: fs, delete the FTE when there are no rules attached to it ASoC: dapm: Don't fold register value changes into notifications mlxsw: spectrum_dcb: Do not warn about priority changes mlxsw: Treat LLDP packets as control drm/amdgpu/ucode: Remove firmware load type check in amdgpu_ucode_free_bo HID: bigben: fix slab-out-of-bounds Write in bigben_probe ASoC: tscs454: Add endianness flag in snd_soc_component_driver net: remove two BUG() from skb_checksum_help() s390/preempt: disable __preempt_count_add() optimization for PROFILE_ALL_BRANCHES perf/amd/ibs: Cascade pmu init functions' return value spi: stm32-qspi: Fix wait_cmd timeout in APM mode dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default ipmi:ssif: Check for NULL msg when handling events and messages ipmi: Fix pr_fmt to avoid compilation issues rtlwifi: Use pr_warn instead of WARN_ONCE media: rga: fix possible memory leak in rga_probe media: coda: limit frame interval enumeration to supported encoder frame sizes media: imon: reorganize serialization media: cec-adap.c: fix is_configuring state openrisc: start CPU timer early in boot nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags ASoC: rt5645: Fix errorenous cleanup order nbd: Fix hung on disconnect request if socket is closed before net: phy: micrel: Allow probing without .driver_data media: exynos4-is: Fix compile warning ASoC: max98357a: remove dependency on GPIOLIB ASoC: rt1015p: remove dependency on GPIOLIB can: mcp251xfd: silence clang's -Wunaligned-access warning x86/microcode: Add explicit CPU vendor dependency m68k: atari: Make Atari ROM port I/O write macros return void rxrpc: Return an error to sendmsg if call failed rxrpc, afs: Fix selection of abort codes eth: tg3: silence the GCC 12 array-bounds warning selftests/bpf: fix btf_dump/btf_dump due to recent clang change gfs2: use i_lock spin_lock for inode qadata IB/rdmavt: add missing locks in rvt_ruc_loopback ARM: dts: ox820: align interrupt controller node name with dtschema ARM: dts: s5pv210: align DMA channels with dtschema arm64: dts: qcom: msm8994: Fix BLSP[12]_DMA channels count PM / devfreq: rk3399_dmc: Disable edev on remove() crypto: ccree - use fine grained DMA mapping dir soc: ti: ti_sci_pm_domains: Check for null return of devm_kcalloc fs: jfs: fix possible NULL pointer dereference in dbFree() ARM: OMAP1: clock: Fix UART rate reporting algorithm powerpc/fadump: Fix fadump to work with a different endian capture kernel fat: add ratelimit to fat*_ent_bread() pinctrl: renesas: rzn1: Fix possible null-ptr-deref in sh_pfc_map_resources() ARM: versatile: Add missing of_node_put in dcscb_init ARM: dts: exynos: add atmel,24c128 fallback to Samsung EEPROM ARM: hisi: Add missing of_node_put after of_find_compatible_node PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store() tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate powerpc/powernv/vas: Assign real address to rx_fifo in vas_rx_win_attr powerpc/xics: fix refcount leak in icp_opal_init() powerpc/powernv: fix missing of_node_put in uv_init() macintosh/via-pmu: Fix build failure when CONFIG_INPUT is disabled powerpc/iommu: Add missing of_node_put in iommu_init_early_dart RDMA/hfi1: Prevent panic when SDMA is disabled drm: fix EDID struct for old ARM OABI format dt-bindings: display: sitronix, st7735r: Fix backlight in example ath11k: acquire ab->base_lock in unassign when finding the peer by addr ath9k: fix ar9003_get_eepmisc drm/edid: fix invalid EDID extension block filtering drm/bridge: adv7511: clean up CEC adapter when probe fails spi: qcom-qspi: Add minItems to interconnect-names ASoC: mediatek: Fix error handling in mt8173_max98090_dev_probe ASoC: mediatek: Fix missing of_node_put in mt2701_wm8960_machine_probe x86/delay: Fix the wrong asm constraint in delay_loop() drm/ingenic: Reset pixclock rate when parent clock rate changes drm/mediatek: Fix mtk_cec_mask() drm/vc4: hvs: Reset muxes at probe time drm/vc4: txp: Don't set TXP_VSTART_AT_EOF drm/vc4: txp: Force alpha to be 0xff if it's disabled libbpf: Don't error out on CO-RE relos for overriden weak subprogs bpf: Fix excessive memory allocation in stack_map_alloc() nl80211: show SSID for P2P_GO interfaces drm/komeda: Fix an undefined behavior bug in komeda_plane_add() drm: mali-dp: potential dereference of null pointer spi: spi-ti-qspi: Fix return value handling of wait_for_completion_timeout scftorture: Fix distribution of short handler delays net: dsa: mt7530: 1G can also support 1000BASE-X link mode NFC: NULL out the dev->rfkill to prevent UAF efi: Add missing prototype for efi_capsule_setup_info target: remove an incorrect unmap zeroes data deduction drbd: fix duplicate array initializer EDAC/dmc520: Don't print an error for each unconfigured interrupt line mtd: rawnand: denali: Use managed device resources HID: hid-led: fix maximum brightness for Dream Cheeky HID: elan: Fix potential double free in elan_input_configured drm/bridge: Fix error handling in analogix_dp_probe sched/fair: Fix cfs_rq_clock_pelt() for throttled cfs_rq spi: img-spfi: Fix pm_runtime_get_sync() error checking cpufreq: Fix possible race in cpufreq online error path ath9k_htc: fix potential out of bounds access with invalid rxstatus->rs_keyix media: hantro: Empty encoder capture buffers by default drm/panel: simple: Add missing bus flags for Innolux G070Y2-L01 ALSA: pcm: Check for null pointer of pointer substream before dereferencing it inotify: show inotify mask flags in proc fdinfo fsnotify: fix wrong lockdep annotations of: overlay: do not break notify on NOTIFY_{OK|STOP} drm/msm/dpu: adjust display_v_end for eDP and DP scsi: ufs: qcom: Fix ufs_qcom_resume() scsi: ufs: core: Exclude UECxx from SFR dump list selftests/resctrl: Fix null pointer dereference on open failed libbpf: Fix logic for finding matching program for CO-RE relocation mtd: spi-nor: core: Check written SR value in spi_nor_write_16bit_sr_and_check() x86/pm: Fix false positive kmemleak report in msr_build_context() mtd: rawnand: cadence: fix possible null-ptr-deref in cadence_nand_dt_probe() x86/speculation: Add missing prototype for unpriv_ebpf_notify() ASoC: rk3328: fix disabling mclk on pclk probe failure perf tools: Add missing headers needed by util/data.h drm/msm/disp/dpu1: set vbif hw config to NULL to avoid use after memory free during pm runtime resume drm/msm/dp: stop event kernel thread when DP unbind drm/msm/dp: fix error check return value of irq_of_parse_and_map() drm/msm/dsi: fix error checks and return values for DSI xmit functions drm/msm/hdmi: check return value after calling platform_get_resource_byname() drm/msm/hdmi: fix error check return value of irq_of_parse_and_map() drm/msm: add missing include to msm_drv.c drm/panel: panel-simple: Fix proper bpc for AM-1280800N3TZQW-T00H drm/rockchip: vop: fix possible null-ptr-deref in vop_bind() perf tools: Use Python devtools for version autodetection rather than runtime virtio_blk: fix the discard_granularity and discard_alignment queue limits x86: Fix return value of __setup handlers irqchip/exiu: Fix acknowledgment of edge triggered interrupts irqchip/aspeed-i2c-ic: Fix irq_of_parse_and_map() return value irqchip/aspeed-scu-ic: Fix irq_of_parse_and_map() return value x86/mm: Cleanup the control_va_addr_alignment() __setup handler arm64: fix types in copy_highpage() regulator: core: Fix enable_count imbalance with EXCLUSIVE_GET drm/msm/dp: fix event thread stuck in wait_event after kthread_stop() drm/msm/mdp5: Return error code in mdp5_pipe_release when deadlock is detected drm/msm/mdp5: Return error code in mdp5_mixer_release when deadlock is detected drm/msm: return an error pointer in msm_gem_prime_get_sg_table() media: uvcvideo: Fix missing check to determine if element is found in list iomap: iomap_write_failed fix spi: spi-fsl-qspi: check return value after calling platform_get_resource_byname() Revert "cpufreq: Fix possible race in cpufreq online error path" regulator: qcom_smd: Fix up PM8950 regulator configuration perf/amd/ibs: Use interrupt regs ip for stack unwinding ath11k: Don't check arvif->is_started before sending management frames ASoC: fsl: Fix refcount leak in imx_sgtl5000_probe ASoC: mxs-saif: Fix refcount leak in mxs_saif_probe regulator: pfuze100: Fix refcount leak in pfuze_parse_regulators_dt ASoC: samsung: Use dev_err_probe() helper ASoC: samsung: Fix refcount leak in aries_audio_probe kselftest/cgroup: fix test_stress.sh to use OUTPUT dir scripts/faddr2line: Fix overlapping text section failures media: aspeed: Fix an error handling path in aspeed_video_probe() media: exynos4-is: Fix PM disable depth imbalance in fimc_is_probe media: st-delta: Fix PM disable depth imbalance in delta_probe media: exynos4-is: Change clk_disable to clk_disable_unprepare media: pvrusb2: fix array-index-out-of-bounds in pvr2_i2c_core_init media: vsp1: Fix offset calculation for plane cropping Bluetooth: fix dangling sco_conn and use-after-free in sco_sock_timeout Bluetooth: Interleave with allowlist scan Bluetooth: L2CAP: Rudimentary typo fixes Bluetooth: LL privacy allow RPA Bluetooth: use inclusive language in HCI role comments Bluetooth: use inclusive language when filtering devices Bluetooth: use hdev lock for accept_list and reject_list in conn req nvme: set dma alignment to dword m68k: math-emu: Fix dependencies of math emulation support lsm,selinux: pass flowi_common instead of flowi to the LSM hooks sctp: read sk->sk_bound_dev_if once in sctp_rcv() net: hinic: add missing destroy_workqueue in hinic_pf_to_mgmt_init ASoC: ti: j721e-evm: Fix refcount leak in j721e_soc_probe_* media: ov7670: remove ov7670_power_off from ov7670_remove media: staging: media: rkvdec: Make use of the helper function devm_platform_ioremap_resource() media: rkvdec: h264: Fix dpb_valid implementation media: rkvdec: h264: Fix bit depth wrap in pps packet ext4: reject the 'commit' option on ext2 filesystems drm/msm/a6xx: Fix refcount leak in a6xx_gpu_init drm: msm: fix possible memory leak in mdp5_crtc_cursor_set() x86/sev: Annotate stack change in the #VC handler drm/msm/dpu: handle pm_runtime_get_sync() errors in bind path drm/i915: Fix CFI violation with show_dynamic_id() thermal/drivers/bcm2711: Don't clamp temperature at zero thermal/drivers/broadcom: Fix potential NULL dereference in sr_thermal_probe thermal/drivers/core: Use a char pointer for the cooling device name thermal/core: Fix memory leak in __thermal_cooling_device_register() thermal/drivers/imx_sc_thermal: Fix refcount leak in imx_sc_thermal_probe ASoC: wm2000: fix missing clk_disable_unprepare() on error in wm2000_anc_transition() NFC: hci: fix sleep in atomic context bugs in nfc_hci_hcp_message_tx ASoC: max98090: Move check for invalid values before casting in max98090_put_enab_tlv() net: stmmac: selftests: Use kcalloc() instead of kzalloc() net: stmmac: fix out-of-bounds access in a selftest hv_netvsc: Fix potential dereference of NULL pointer rxrpc: Fix listen() setting the bar too high for the prealloc rings rxrpc: Don't try to resend the request if we're receiving the reply rxrpc: Fix overlapping ACK accounting rxrpc: Don't let ack.previousPacket regress rxrpc: Fix decision on when to generate an IDLE ACK net: huawei: hinic: Use devm_kcalloc() instead of devm_kzalloc() hinic: Avoid some over memory allocation net/smc: postpone sk_refcnt increment in connect() arm64: dts: rockchip: Move drive-impedance-ohm to emmc phy on rk3399 memory: samsung: exynos5422-dmc: Avoid some over memory allocation ARM: dts: suniv: F1C100: fix watchdog compatible soc: qcom: smp2p: Fix missing of_node_put() in smp2p_parse_ipc soc: qcom: smsm: Fix missing of_node_put() in smsm_parse_ipc PCI: cadence: Fix find_first_zero_bit() limit PCI: rockchip: Fix find_first_zero_bit() limit PCI: dwc: Fix setting error return on MSI DMA mapping failure ARM: dts: ci4x10: Adapt to changes in imx6qdl.dtsi regarding fec clocks soc: qcom: llcc: Add MODULE_DEVICE_TABLE() KVM: nVMX: Leave most VM-Exit info fields unmodified on failed VM-Entry KVM: nVMX: Clear IDT vectoring on nested VM-Exit for double/triple fault platform/chrome: cros_ec: fix error handling in cros_ec_register() ARM: dts: imx6dl-colibri: Fix I2C pinmuxing platform/chrome: Re-introduce cros_ec_cmd_xfer and use it for ioctls can: xilinx_can: mark bit timing constants as const ARM: dts: stm32: Fix PHY post-reset delay on Avenger96 ARM: dts: bcm2835-rpi-zero-w: Fix GPIO line name for Wifi/BT ARM: dts: bcm2837-rpi-cm3-io3: Fix GPIO line names for SMPS I2C ARM: dts: bcm2837-rpi-3-b-plus: Fix GPIO line name of power LED ARM: dts: bcm2835-rpi-b: Fix GPIO line names misc: ocxl: fix possible double free in ocxl_file_register_afu crypto: marvell/cesa - ECB does not IV gpiolib: of: Introduce hook for missing gpio-ranges pinctrl: bcm2835: implement hook for missing gpio-ranges arm: mediatek: select arch timer for mt7629 powerpc/fadump: fix PT_LOAD segment for boot memory area mfd: ipaq-micro: Fix error check return value of platform_get_irq() scsi: fcoe: Fix Wstringop-overflow warnings in fcoe_wwn_from_mac() firmware: arm_scmi: Fix list protocols enumeration in the base protocol nvdimm: Fix firmware activation deadlock scenarios nvdimm: Allow overwrite in the presence of disabled dimms pinctrl: mvebu: Fix irq_of_parse_and_map() return value drivers/base/node.c: fix compaction sysfs file leak dax: fix cache flush on PMD-mapped pages drivers/base/memory: fix an unlikely reference counting issue in __add_memory_block() powerpc/8xx: export 'cpm_setbrg' for modules pinctrl: renesas: core: Fix possible null-ptr-deref in sh_pfc_map_resources() powerpc/idle: Fix return value of __setup() handler powerpc/4xx/cpm: Fix return value of __setup() handler ASoC: atmel-pdmic: Remove endianness flag on pdmic component ASoC: atmel-classd: Remove endianness flag on class d component proc: fix dentry/inode overinstantiating under /proc/${pid}/net ipc/mqueue: use get_tree_nodev() in mqueue_get_tree() PCI: imx6: Fix PERST# start-up sequence tty: fix deadlock caused by calling printk() under tty_port->lock crypto: sun8i-ss - rework handling of IV crypto: sun8i-ss - handle zero sized sg crypto: cryptd - Protect per-CPU resource by disabling BH. Input: sparcspkr - fix refcount leak in bbc_beep_probe PCI/AER: Clear MULTI_ERR_COR/UNCOR_RCV bits hwrng: omap3-rom - fix using wrong clk_disable() in omap_rom_rng_runtime_resume() powerpc/64: Only WARN if __pa()/__va() called with bad addresses powerpc/perf: Fix the threshold compare group constraint for power9 macintosh: via-pmu and via-cuda need RTC_LIB powerpc/fsl_rio: Fix refcount leak in fsl_rio_setup mfd: davinci_voicecodec: Fix possible null-ptr-deref davinci_vc_probe() mailbox: forward the hrtimer if not queued and under a lock RDMA/hfi1: Prevent use of lock before it is initialized Input: stmfts - do not leave device disabled in stmfts_input_open OPP: call of_node_put() on error path in _bandwidth_supported() f2fs: fix dereference of stale list iterator after loop body iommu/mediatek: Add list_del in mtk_iommu_remove i2c: at91: use dma safe buffers cpufreq: mediatek: add missing platform_driver_unregister() on error in mtk_cpufreq_driver_init cpufreq: mediatek: Use module_init and add module_exit cpufreq: mediatek: Unregister platform device on exit MIPS: Loongson: Use hwmon_device_register_with_groups() to register hwmon i2c: at91: Initialize dma_buf in at91_twi_xfer() dmaengine: idxd: Fix the error handling path in idxd_cdev_register() NFS: Do not report EINTR/ERESTARTSYS as mapping errors NFS: fsync() should report filesystem errors over EINTR/ERESTARTSYS NFS: Do not report flush errors in nfs_write_end() NFS: Don't report errors from nfs_pageio_complete() more than once NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout video: fbdev: clcdfb: Fix refcount leak in clcdfb_of_vram_setup dmaengine: stm32-mdma: remove GISR1 register dmaengine: stm32-mdma: rework interrupt handler dmaengine: stm32-mdma: fix chan initialization in stm32_mdma_irq_handler() iommu/amd: Increase timeout waiting for GA log enablement i2c: npcm: Fix timeout calculation i2c: npcm: Correct register access width i2c: npcm: Handle spurious interrupts i2c: rcar: fix PM ref counts in probe error paths perf c2c: Use stdio interface if slang is not supported perf jevents: Fix event syntax error caused by ExtSel f2fs: fix to avoid f2fs_bug_on() in dec_valid_node_count() f2fs: fix to do sanity check on block address in f2fs_do_zero_range() f2fs: fix to clear dirty inode in f2fs_evict_inode() f2fs: fix deadloop in foreground GC f2fs: don't need inode lock for system hidden quota f2fs: fix to do sanity check on total_data_blocks f2fs: fix fallocate to use file_modified to update permissions consistently f2fs: fix to do sanity check for inline inode wifi: mac80211: fix use-after-free in chanctx code iwlwifi: mvm: fix assert 1F04 upon reconfig fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages efi: Do not import certificates from UEFI Secure Boot for T2 Macs bfq: Split shared queues on move between cgroups bfq: Update cgroup information before merging bio bfq: Track whether bfq_group is still online ext4: fix use-after-free in ext4_rename_dir_prepare ext4: fix warning in ext4_handle_inode_extension ext4: fix bug_on in ext4_writepages ext4: filter out EXT4_FC_REPLAY from on-disk superblock field s_state ext4: fix bug_on in __es_tree_search ext4: verify dir block before splitting it ext4: avoid cycles in directory h-tree ACPI: property: Release subnode properties with data nodes tracing: Fix potential double free in create_var_ref() PCI/PM: Fix bridge_d3_blacklist[] Elo i2 overwrite of Gigabyte X299 PCI: qcom: Fix runtime PM imbalance on probe errors PCI: qcom: Fix unbalanced PHY init on probe errors mm, compaction: fast_find_migrateblock() should return pfn in the target zone s390/perf: obtain sie_block from the right address dlm: fix plock invalid read dlm: fix missing lkb refcount handling ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock scsi: dc395x: Fix a missing check on list iterator scsi: ufs: qcom: Add a readl() to make sure ref_clk gets enabled drm/amdgpu/cs: make commands with 0 chunks illegal behaviour. drm/etnaviv: check for reaped mapping in etnaviv_iommu_unmap_gem drm/nouveau/clk: Fix an incorrect NULL check on list iterator drm/nouveau/kms/nv50-: atom: fix an incorrect NULL check on list iterator drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX drm/i915/dsi: fix VBT send packet port selection for ICL+ md: fix an incorrect NULL check in does_sb_need_changing md: fix an incorrect NULL check in md_reload_sb mtd: cfi_cmdset_0002: Move and rename chip_check/chip_ready/chip_good_for_write mtd: cfi_cmdset_0002: Use chip_ready() for write on S29GL064N media: coda: Fix reported H264 profile media: coda: Add more H264 levels for CODA960 ima: remove the IMA_TEMPLATE Kconfig option Kconfig: Add option for asm goto w/ tied outputs to workaround clang-13 bug RDMA/hfi1: Fix potential integer multiplication overflow errors csky: patch_text: Fixup last cpu should be master irqchip/armada-370-xp: Do not touch Performance Counter Overflow on A375, A38x, A39x irqchip: irq-xtensa-mx: fix initial IRQ affinity cfg80211: declare MODULE_FIRMWARE for regulatory.db mac80211: upgrade passive scan to active scan on DFS channels after beacon rx um: chan_user: Fix winch_tramp() return value um: Fix out-of-bounds read in LDT setup kexec_file: drop weak attribute from arch_kexec_apply_relocations[_add] ftrace: Clean up hash direct_functions on register failures iommu/msm: Fix an incorrect NULL check on list iterator nodemask.h: fix compilation error with GCC12 hugetlb: fix huge_pmd_unshare address update xtensa/simdisk: fix proc_read_simdisk() rtl818x: Prevent using not initialized queues ASoC: rt5514: Fix event generation for "DSP Voice Wake Up" control carl9170: tx: fix an incorrect use of list iterator stm: ltdc: fix two incorrect NULL checks on list iterator bcache: improve multithreaded bch_btree_check() bcache: improve multithreaded bch_sectors_dirty_init() bcache: remove incremental dirty sector counting for bch_sectors_dirty_init() bcache: avoid journal no-space deadlock by reserving 1 journal bucket serial: pch: don't overwrite xmit->buf[0] by x_char tilcdc: tilcdc_external: fix an incorrect NULL check on list iterator gma500: fix an incorrect NULL check on list iterator arm64: dts: qcom: ipq8074: fix the sleep clock frequency phy: qcom-qmp: fix struct clk leak on probe errors ARM: dts: s5pv210: Remove spi-cs-high on panel in Aries ARM: pxa: maybe fix gpio lookup tables SMB3: EBADF/EIO errors in rename/open caused by race condition in smb2_compound_op docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0 dt-bindings: gpio: altera: correct interrupt-cells vdpasim: allow to enable a vq repeatedly blk-iolatency: Fix inflight count imbalances and IO hangs on offline coresight: core: Fix coresight device probe failure issue phy: qcom-qmp: fix reset-controller leak on probe errors net: ipa: fix page free in ipa_endpoint_trans_release() net: ipa: fix page free in ipa_endpoint_replenish_one() xfs: set inode size after creating symlink xfs: sync lazy sb accounting on quiesce of read-only mounts xfs: fix chown leaking delalloc quota blocks when fssetxattr fails xfs: fix incorrect root dquot corruption error when switching group/project quota types xfs: restore shutdown check in mapped write fault path xfs: force log and push AIL to clear pinned inodes when aborting mount xfs: consider shutdown in bmapbt cursor delete assert xfs: assert in xfs_btree_del_cursor should take into account error kseltest/cgroup: Make test_stress.sh work if run interactively thermal/core: fix a UAF bug in __thermal_cooling_device_register() thermal/core: Fix memory leak in the error path bfq: Avoid merging queues with different parents bfq: Drop pointless unlock-lock pair bfq: Remove pointless bfq_init_rq() calls bfq: Get rid of __bio_blkcg() usage bfq: Make sure bfqg for which we are queueing requests is online block: fix bio_clone_blkg_association() to associate with proper blkcg_gq Revert "random: use static branch for crng_ready()" RDMA/rxe: Generate a completion for unsupported/invalid opcode MIPS: IP27: Remove incorrect `cpu_has_fpu' override MIPS: IP30: Remove incorrect `cpu_has_fpu' override ext4: only allow test_dummy_encryption when supported md: bcache: check the return value of kzalloc() in detached_dev_do_request() Linux 5.10.121 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I52dd11dc43acfa0ebddd2b6e277c823b96b07327 |
||
![]() |
9a9dc60da7 |
fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages
commit 68f4c6eba70df70a720188bce95c85570ddfcc87 upstream. Commit |
||
![]() |
d483eed85f |
ANDROID: GKI: set vfs-only exports into their own namespace
We have namespaces, so use them for all vfs-exported namespaces so that filesystems can use them, but not anything else. Some in-kernel drivers that do direct filesystem accesses (because they serve up files) are also allowed access to these symbols to keep 'make allmodconfig' builds working properly, but it is not needed for Android kernel images. Bug: 157965270 Bug: 210074446 Cc: Matthias Maennich <maennich@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iaf6140baf3a18a516ab2d5c3966235c42f3f70de |
||
![]() |
6939c39a41 |
writeback: fix obtain a reference to a freeing memcg css
[ Upstream commit 8b0ed8443ae6458786580d36b7d5f8125535c5d4 ]
The caller of wb_get_create() should pin the memcg, because
wb_get_create() relies on this guarantee. The rcu read lock
only can guarantee that the memcg css returned by css_from_id()
cannot be released, but the reference of the memcg can be zero.
rcu_read_lock()
memcg_css = css_from_id()
wb_get_create(memcg_css)
cgwb_create(memcg_css)
// css_get can change the ref counter from 0 back to 1
css_get(memcg_css)
rcu_read_unlock()
Fix it by holding a reference to the css before calling
wb_get_create(). This is not a problem I encountered in the
real world. Just the result of a code review.
Fixes:
|
||
![]() |
0c1d1517d6 |
writeback, cgroup: increment isw_nr_in_flight before grabbing an inode
[ Upstream commit 8826ee4fe75051f8cbfa5d4a9aa70565938e724c ] isw_nr_in_flight is used to determine whether the inode switch queue should be flushed from the umount path. Currently it's increased after grabbing an inode and even scheduling the switch work. It means the umount path can walk past cleanup_offline_cgwb() with active inode references, which can result in a "Busy inodes after unmount." message and use-after-free issues (with inode->i_sb which gets freed). Fix it by incrementing isw_nr_in_flight before doing anything with the inode and decrementing in the case when switching wasn't scheduled. The problem hasn't yet been seen in the real life and was discovered by Jan Kara by looking into the code. Link: https://lkml.kernel.org/r/20210608230225.2078447-4-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Suggested-by: Jan Kara <jack@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dave Chinner <dchinner@redhat.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
![]() |
f58625bf2c |
block_dump: remove block_dump feature in mark_inode_dirty()
[ Upstream commit 12e0613715e1cf305fffafaf0e89d810d9a85cc0 ] block_dump is an old debugging interface, one of it's functions is used to print the information about who write which file on disk. If we enable block_dump through /proc/sys/vm/block_dump and turn on debug log level, we can gather information about write process name, target file name and disk from kernel message. This feature is realized in block_dump___mark_inode_dirty(), it print above information into kernel message directly when marking inode dirty, so it is noisy and can easily trigger log storm. At the same time, get the dentry refcount is also not safe, we found it will lead to deadlock on ext4 file system with data=journal mode. After tracepoints has been introduced into the kernel, we got a tracepoint in __mark_inode_dirty(), which is a better replacement of block_dump___mark_inode_dirty(). The only downside is that it only trace the inode number and not a file name, but it probably doesn't matter because the original printed file name in block_dump is not accurate in some cases, and we can still find it through the inode number and device id. So this patch delete the dirting inode part of block_dump feature. Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210313030146.2882027-2-yi.zhang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
![]() |
13ef6bccab |
fs: fix lazytime expiration handling in __writeback_single_inode()
commit 1e249cb5b7fc09ff216aa5a12f6c302e434e88f9 upstream. When lazytime is enabled and an inode is being written due to its in-memory updated timestamps having expired, either due to a sync() or syncfs() system call or due to dirtytime_expire_interval having elapsed, the VFS needs to inform the filesystem so that the filesystem can copy the inode's timestamps out to the on-disk data structures. This is done by __writeback_single_inode() calling mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC). However, this occurs after __writeback_single_inode() has already cleared the dirty flags from ->i_state. This causes two bugs: - mark_inode_dirty_sync() redirties the inode, causing it to remain dirty. This wastefully causes the inode to be written twice. But more importantly, it breaks cases where sync_filesystem() is expected to clean dirty inodes. This includes the FS_IOC_REMOVE_ENCRYPTION_KEY ioctl (as reported at https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well as possibly filesystem freezing (freeze_super()). - Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is called from __writeback_single_inode() for lazytime expiration, xfs_fs_dirty_inode() ignores the notification. (XFS only cares about lazytime expirations, and it assumes that i_state will contain I_DIRTY_TIME during those.) Therefore, lazy timestamps aren't persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS. Fix this by moving the call to mark_inode_dirty_sync() to earlier in __writeback_single_inode(), before the dirty flags are cleared from i_state. This makes filesystems be properly notified of the timestamp expiration, and it avoids incorrectly redirtying the inode. This fixes xfstest generic/580 (which tests FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime enabled. It also fixes the new lazytime xfstest I've proposed, which reproduces the above-mentioned XFS bug (https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org). Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly. But due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the right thing to do because mark_inode_dirty_sync() now knows not to move the inode to a writeback list if it is currently queued for sync. Fixes: |
||
![]() |
3ad11d7ac8 |
Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe: - Series of merge handling cleanups (Baolin, Christoph) - Series of blk-throttle fixes and cleanups (Baolin) - Series cleaning up BDI, seperating the block device from the backing_dev_info (Christoph) - Removal of bdget() as a generic API (Christoph) - Removal of blkdev_get() as a generic API (Christoph) - Cleanup of is-partition checks (Christoph) - Series reworking disk revalidation (Christoph) - Series cleaning up bio flags (Christoph) - bio crypt fixes (Eric) - IO stats inflight tweak (Gabriel) - blk-mq tags fixes (Hannes) - Buffer invalidation fixes (Jan) - Allow soft limits for zone append (Johannes) - Shared tag set improvements (John, Kashyap) - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel) - DM no-wait support (Mike, Konstantin) - Request allocation improvements (Ming) - Allow md/dm/bcache to use IO stat helpers (Song) - Series improving blk-iocost (Tejun) - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang, Xianting, Yang, Yufen, yangerkun) * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits) block: fix uapi blkzoned.h comments blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue blk-mq: get rid of the dead flush handle code path block: get rid of unnecessary local variable block: fix comment and add lockdep assert blk-mq: use helper function to test hw stopped block: use helper function to test queue register block: remove redundant mq check block: invoke blk_mq_exit_sched no matter whether have .exit_sched percpu_ref: don't refer to ref->data if it isn't allocated block: ratelimit handle_bad_sector() message blk-throttle: Re-use the throtl_set_slice_end() blk-throttle: Open code __throtl_de/enqueue_tg() blk-throttle: Move service tree validation out of the throtl_rb_first() blk-throttle: Move the list operation after list validation blk-throttle: Fix IO hang for a corner case blk-throttle: Avoid tracking latency if low limit is invalid blk-throttle: Avoid getting the current time if tg->last_finish_time is 0 blk-throttle: Remove a meaningless parameter for throtl_downgrade_state() block: Remove redundant 'return' statement ... |
||
![]() |
f56753ac2a |
bdi: replace BDI_CAP_NO_{WRITEBACK,ACCT_DIRTY} with a single flag
Replace the two negative flags that are always used together with a single positive flag that indicates the writeback capability instead of two related non-capabilities. Also remove the pointless wrappers to just check the flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
9ca48e20ec |
fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype
Commit |
||
![]() |
5fcd57505c |
writeback: Drop I_DIRTY_TIME_EXPIRE
The only use of I_DIRTY_TIME_EXPIRE is to detect in __writeback_single_inode() that inode got there because flush worker decided it's time to writeback the dirty inode time stamps (either because we are syncing or because of age). However we can detect this directly in __writeback_single_inode() and there's no need for the strange propagation with I_DIRTY_TIME_EXPIRE flag. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> |
||
![]() |
f9cae926f3 |
writeback: Fix sync livelock due to b_dirty_time processing
When we are processing writeback for sync(2), move_expired_inodes()
didn't set any inode expiry value (older_than_this). This can result in
writeback never completing if there's steady stream of inodes added to
b_dirty_time list as writeback rechecks dirty lists after each writeback
round whether there's more work to be done. Fix the problem by using
sync(2) start time is inode expiry value when processing b_dirty_time
list similarly as for ordinarily dirtied inodes. This requires some
refactoring of older_than_this handling which simplifies the code
noticeably as a bonus.
Fixes:
|
||
![]() |
5afced3bf2 |
writeback: Avoid skipping inode writeback
Inode's i_io_list list head is used to attach inode to several different
lists - wb->{b_dirty, b_dirty_time, b_io, b_more_io}. When flush worker
prepares a list of inodes to writeback e.g. for sync(2), it moves inodes
to b_io list. Thus it is critical for sync(2) data integrity guarantees
that inode is not requeued to any other writeback list when inode is
queued for processing by flush worker. That's the reason why
writeback_single_inode() does not touch i_io_list (unless the inode is
completely clean) and why __mark_inode_dirty() does not touch i_io_list
if I_SYNC flag is set.
However there are two flaws in the current logic:
1) When inode has only I_DIRTY_TIME set but it is already queued in b_io
list due to sync(2), concurrent __mark_inode_dirty(inode, I_DIRTY_SYNC)
can still move inode back to b_dirty list resulting in skipping
writeback of inode time stamps during sync(2).
2) When inode is on b_dirty_time list and writeback_single_inode() races
with __mark_inode_dirty() like:
writeback_single_inode() __mark_inode_dirty(inode, I_DIRTY_PAGES)
inode->i_state |= I_SYNC
__writeback_single_inode()
inode->i_state |= I_DIRTY_PAGES;
if (inode->i_state & I_SYNC)
bail
if (!(inode->i_state & I_DIRTY_ALL))
- not true so nothing done
We end up with I_DIRTY_PAGES inode on b_dirty_time list and thus
standard background writeback will not writeback this inode leading to
possible dirty throttling stalls etc. (thanks to Martijn Coenen for this
analysis).
Fix these problems by tracking whether inode is queued in b_io or
b_more_io lists in a new I_SYNC_QUEUED flag. When this flag is set, we
know flush worker has queued inode and we should not touch i_io_list.
On the other hand we also know that once flush worker is done with the
inode it will requeue the inode to appropriate dirty list. When
I_SYNC_QUEUED is not set, __mark_inode_dirty() can (and must) move inode
to appropriate dirty list.
Reported-by: Martijn Coenen <maco@android.com>
Reviewed-by: Martijn Coenen <maco@android.com>
Tested-by: Martijn Coenen <maco@android.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Fixes:
|
||
![]() |
b35250c081 |
writeback: Protect inode->i_io_list with inode->i_lock
Currently, operations on inode->i_io_list are protected by wb->list_lock. In the following patches we'll need to maintain consistency between inode->i_state and inode->i_io_list so change the code so that inode->i_lock protects also all inode's i_io_list handling. Reviewed-by: Martijn Coenen <maco@android.com> Reviewed-by: Christoph Hellwig <hch@lst.de> CC: stable@vger.kernel.org # Prerequisite for "writeback: Avoid skipping inode writeback" Signed-off-by: Jan Kara <jack@suse.cz> |
||
![]() |
0b166a57e6 |
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o: "A lot of bug fixes and cleanups for ext4, including: - Fix performance problems found in dioread_nolock now that it is the default, caused by transaction leaks. - Clean up fiemap handling in ext4 - Clean up and refactor multiple block allocator (mballoc) code - Fix a problem with mballoc with a smaller file systems running out of blocks because they couldn't properly use blocks that had been reserved by inode preallocation. - Fixed a race in ext4_sync_parent() versus rename() - Simplify the error handling in the extent manipulation code - Make sure all metadata I/O errors are felected to ext4_ext_dirty()'s and ext4_make_inode_dirty()'s callers. - Avoid passing an error pointer to brelse in ext4_xattr_set() - Fix race which could result to freeing an inode on the dirty last in data=journal mode. - Fix refcount handling if ext4_iget() fails - Fix a crash in generic/019 caused by a corrupted extent node" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits) ext4: avoid unnecessary transaction starts during writeback ext4: don't block for O_DIRECT if IOCB_NOWAIT is set ext4: remove the access_ok() check in ext4_ioctl_get_es_cache fs: remove the access_ok() check in ioctl_fiemap fs: handle FIEMAP_FLAG_SYNC in fiemap_prep fs: move fiemap range validation into the file systems instances iomap: fix the iomap_fiemap prototype fs: move the fiemap definitions out of fs.h fs: mark __generic_block_fiemap static ext4: remove the call to fiemap_check_flags in ext4_fiemap ext4: split _ext4_fiemap ext4: fix fiemap size checks for bitmap files ext4: fix EXT4_MAX_LOGICAL_BLOCK macro add comment for ext4_dir_entry_2 file_type member jbd2: avoid leaking transaction credits when unreserving handle ext4: drop ext4_journal_free_reserved() ext4: mballoc: use lock for checking free blocks while retrying ext4: mballoc: refactor ext4_mb_good_group() ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling ext4: mballoc: refactor ext4_mb_discard_preallocations() ... |
||
![]() |
4301efa4c7 |
writeback: Export inode_io_list_del()
Ext4 needs to remove inode from writeback lists after it is out of visibility of its journalling machinery (which can still dirty the inode). Export inode_io_list_del() for it. Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200421085445.5731-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> |
||
![]() |
750a02ab8d |
Merge tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe: "Core block changes that have been queued up for this release: - Remove dead blk-throttle and blk-wbt code (Guoqing) - Include pid in blktrace note traces (Jan) - Don't spew I/O errors on wouldblock termination (me) - Zone append addition (Johannes, Keith, Damien) - IO accounting improvements (Konstantin, Christoph) - blk-mq hardware map update improvements (Ming) - Scheduler dispatch improvement (Salman) - Inline block encryption support (Satya) - Request map fixes and improvements (Weiping) - blk-iocost tweaks (Tejun) - Fix for timeout failing with error injection (Keith) - Queue re-run fixes (Douglas) - CPU hotplug improvements (Christoph) - Queue entry/exit improvements (Christoph) - Move DMA drain handling to the few drivers that use it (Christoph) - Partition handling cleanups (Christoph)" * tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block: (127 commits) block: mark bio_wouldblock_error() bio with BIO_QUIET blk-wbt: rename __wbt_update_limits to wbt_update_limits blk-wbt: remove wbt_update_limits blk-throttle: remove tg_drain_bios blk-throttle: remove blk_throtl_drain null_blk: force complete for timeout request blk-mq: drain I/O when all CPUs in a hctx are offline blk-mq: add blk_mq_all_tag_iter blk-mq: open code __blk_mq_alloc_request in blk_mq_alloc_request_hctx blk-mq: use BLK_MQ_NO_TAG in more places blk-mq: rename BLK_MQ_TAG_FAIL to BLK_MQ_NO_TAG blk-mq: move more request initialization to blk_mq_rq_ctx_init blk-mq: simplify the blk_mq_get_request calling convention blk-mq: remove the bio argument to ->prepare_request nvme: force complete cancelled requests blk-mq: blk-mq: provide forced completion method block: fix a warning when blkdev.h is included for !CONFIG_BLOCK builds block: blk-crypto-fallback: remove redundant initialization of variable err block: reduce part_stat_lock() scope block: use __this_cpu_add() instead of access by smp_processor_id() ... |
||
![]() |
8d92890bd6 |
mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead
After an NFS page has been written it is considered "unstable" until a
COMMIT request succeeds. If the COMMIT fails, the page will be
re-written.
These "unstable" pages are currently accounted as "reclaimable", either
in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a
'reclaimable' count. This might have made sense when sending the COMMIT
required a separate action by the VFS/MM (e.g. releasepage() used to
send a COMMIT). However now that all writes generated by ->writepages()
will automatically be followed by a COMMIT (since commit
|
||
![]() |
1cd925d583 |
bdi: remove the name field in struct backing_dev_info
The name is only printed for a not registered bdi in writeback. Use the device name there as is more useful anyway for the unlike case that the warning triggers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
68f23b8906 |
memcg: fix a crash in wb_workfn when a device disappears
Without memcg, there is a one-to-one mapping between the bdi and bdi_writeback structures. In this world, things are fairly straightforward; the first thing bdi_unregister() does is to shutdown the bdi_writeback structure (or wb), and part of that writeback ensures that no other work queued against the wb, and that the wb is fully drained. With memcg, however, there is a one-to-many relationship between the bdi and bdi_writeback structures; that is, there are multiple wb objects which can all point to a single bdi. There is a refcount which prevents the bdi object from being released (and hence, unregistered). So in theory, the bdi_unregister() *should* only get called once its refcount goes to zero (bdi_put will drop the refcount, and when it is zero, release_bdi gets called, which calls bdi_unregister). Unfortunately, del_gendisk() in block/gen_hd.c never got the memo about the Brave New memcg World, and calls bdi_unregister directly. It does this without informing the file system, or the memcg code, or anything else. This causes the root wb associated with the bdi to be unregistered, but none of the memcg-specific wb's are shutdown. So when one of these wb's are woken up to do delayed work, they try to dereference their wb->bdi->dev to fetch the device name, but unfortunately bdi->dev is now NULL, thanks to the bdi_unregister() called by del_gendisk(). As a result, *boom*. Fortunately, it looks like the rest of the writeback path is perfectly happy with bdi->dev and bdi->owner being NULL, so the simplest fix is to create a bdi_dev_name() function which can handle bdi->dev being NULL. This also allows us to bulletproof the writeback tracepoints to prevent them from dereferencing a NULL pointer and crashing the kernel if one is tracing with memcg's enabled, and an iSCSI device dies or a USB storage stick is pulled. The most common way of triggering this will be hotremoval of a device while writeback with memcg enabled is going on. It was triggering several times a day in a heavily loaded production environment. Google Bug Id: 145475544 Link: https://lore.kernel.org/r/20191227194829.150110-1-tytso@mit.edu Link: http://lkml.kernel.org/r/20191228005211.163952-1-tytso@mit.edu Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Chris Mason <clm@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
![]() |
65de03e251 |
cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead
cgroup writeback tries to refresh the associated wb immediately if the current wb is dead. This is to avoid keeping issuing IOs on the stale wb after memcg - blkcg association has changed (ie. when blkcg got disabled / enabled higher up in the hierarchy). Unfortunately, the logic gets triggered spuriously on inodes which are associated with dead cgroups. When the logic is triggered on dead cgroups, the attempt fails only after doing quite a bit of work allocating and initializing a new wb. While |
||
![]() |
b46ec1da5e |
fs/fs-writeback.c: fix kernel-doc warning
Fix kernel-doc warning in fs/fs-writeback.c:
fs/fs-writeback.c:913: warning: Excess function parameter 'nr_pages' description in 'cgroup_writeback_by_id'
Link: http://lkml.kernel.org/r/756645ac-0ce8-d47e-d30a-04d9e4923a4f@infradead.org
Fixes:
|
||
![]() |
8e00c4e9dd |
writeback: fix use-after-free in finish_writeback_work()
finish_writeback_work() reads @done->waitq after decrementing
@done->cnt. However, once @done->cnt reaches zero, @done may be freed
(from stack) at any moment and @done->waitq can contain something
unrelated by the time finish_writeback_work() tries to read it. This
led to the following crash.
"BUG: kernel NULL pointer dereference, address: 0000000000000002"
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
CPU: 40 PID: 555153 Comm: kworker/u98:50 Kdump: loaded Not tainted
...
Workqueue: writeback wb_workfn (flush-btrfs-1)
RIP: 0010:_raw_spin_lock_irqsave+0x10/0x30
Code: 48 89 d8 5b c3 e8 50 db 6b ff eb f4 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 53 9c 5b fa 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 05 48 89 d8 5b c3 89 c6 e8 fe ca 6b ff eb f2 66 90
RSP: 0018:ffffc90049b27d98 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000003 RDI: 0000000000000002
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
R10: ffff889fff407600 R11: ffff88ba9395d740 R12: 000000000000e300
R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88bfdfa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000002 CR3: 0000000002409005 CR4: 00000000001606e0
Call Trace:
__wake_up_common_lock+0x63/0xc0
wb_workfn+0xd2/0x3e0
process_one_work+0x1f5/0x3f0
worker_thread+0x2d/0x3d0
kthread+0x111/0x130
ret_from_fork+0x1f/0x30
Fix it by reading and caching @done->waitq before decrementing
@done->cnt.
Link: http://lkml.kernel.org/r/20190924010631.GH2233839@devbig004.ftw2.facebook.com
Fixes:
|
||
![]() |
3a8e9ac89e |
writeback: add tracepoints for cgroup foreign writebacks
cgroup foreign inode handling has quite a bit of heuristics and internal states which sometimes makes it difficult to understand what's going on. Add tracepoints to improve visibility. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
d62241c7a4 |
writeback, memcg: Implement cgroup_writeback_by_id()
Implement cgroup_writeback_by_id() which initiates cgroup writeback from bdi and memcg IDs. This will be used by memcg foreign inode flushing. v2: Use wb_get_lookup() instead of wb_get_create() to avoid creating spurious wbs. v3: Interpret 0 @nr as 1.25 * nr_dirty to implement best-effort flushing while avoding possible livelocks. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
5b9cce4c7e |
writeback: Generalize and expose wb_completion
wb_completion is used to track writeback completions. We want to use it from memcg side for foreign inode flushes. This patch updates it to remember the target waitq instead of assuming bdi->wb_waitq and expose it outside of fs-writeback.c. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
6444f47eb8 |
writeback, cgroup: inode_switch_wbs() shouldn't give up on wb_switch_rwsem trylock fail
As inode wb switching may make sync(2) miss some inodes, they're synchronized using wb_switch_rwsem so that no wb switching happens while sync(2) is in progress. In addition to synchronizing the actual switching, the rwsem is also used to prevent queueing new switch attempts while sync(2) is in progress. This is to avoid queueing too many instances while the rwsem is held by sync(2). Unfortunately, this is too agressive and can block wb switching for a long time if sync(2) is frequent. The goal is avoiding expolding the number of scheduled switches, not avoiding scheduling anything. Let's use wb_switch_rwsem only for synchronizing the actual switching and sync(2) and use isw_nr_in_flight instead for limiting the maximum number of scheduled switches. The limit is set to 1024 which should be more than enough while still avoiding extreme situations. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
55a694dffb |
writeback, cgroup: Adjust WB_FRN_TIME_CUT_DIV to accelerate foreign inode switching
WB_FRN_TIME_CUT_DIV is used to tell the foreign inode detection logic to ignore short writeback rounds to prevent getting confused by a burst of short writebacks. The parameter is currently 2 meaning that anything smaller than half of the running average writback duration will be ignored. This is unnecessarily aggressive. The detection logic uses 16 history slots and is already reasonably protected against some short bursts confusing it and the current parameter can lead to tens of seconds of missed detection depending on the writeback pattern. Let's change the parameter to 8, so that it only ignores writeback with are smaller than 12.5% of the current running average. v2: Add comment explaining what's going on with the foreign detection parameters. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
27b36d8fa8 |
blkcg, writeback: Add wbc->no_cgroup_owner
When writeback IOs are bounced through async layers, the IOs should only be accounted against the wbc from the original bdi writeback to avoid confusing cgroup inode ownership arbitration. Add wbc->no_cgroup_owner to allow disabling wbc cgroup owner accounting. This will be used make btrfs compression work well with cgroup IO control. v2: Renamed from no_wbc_acct to no_cgroup_owner and added comment as per Jan. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
34e51a5e1a |
blkcg, writeback: Rename wbc_account_io() to wbc_account_cgroup_owner()
wbc_account_io() does a very specific job - try to see which cgroup is actually dirtying an inode and transfer its ownership to the majority dirtier if needed. The name is too generic and confusing. Let's rename it to something more specific. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
9b0eb69b75 |
cgroup, blkcg: Prepare some symbols for module and !CONFIG_CGROUP usages
btrfs is going to use css_put() and wbc helpers to improve cgroup writeback support. Add dummy css_get() definition and export wbc helpers to prepare for module and !CONFIG_CGROUP builds. Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
6631142229 |
blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration
wbc_account_io() collects information on cgroup ownership of writeback pages to determine which cgroup should own the inode. Pages can stay associated with dead memcgs but we want to avoid attributing IOs to dead blkcgs as much as possible as the association is likely to be stale. However, currently, pages associated with dead memcgs contribute to the accounting delaying and/or confusing the arbitration. Fix it by ignoring pages associated with dead memcgs. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
457c899653 |
treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
ec084de929 |
fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
synchronize_rcu() didn't wait for call_rcu() callbacks, so inode wb switch may not go to the workqueue after synchronize_rcu(). Thus previous scheduled switches was not finished even flushing the workqueue, which will cause a NULL pointer dereferenced followed below. VFS: Busy inodes after unmount of vdd. Self-destruct in 5 seconds. Have a nice day... BUG: unable to handle kernel NULL pointer dereference at 0000000000000278 evict+0xb3/0x180 iput+0x1b0/0x230 inode_switch_wbs_work_fn+0x3c0/0x6a0 worker_thread+0x4e/0x490 ? process_one_work+0x410/0x410 kthread+0xe6/0x100 ret_from_fork+0x39/0x50 Replace the synchronize_rcu() call with a rcu_barrier() to wait for all pending callbacks to finish. And inc isw_nr_in_flight after call_rcu() in inode_switch_wbs() to make more sense. Link: http://lkml.kernel.org/r/20190429024108.54150-1-jiufei.xue@linux.alibaba.com Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com> Acked-by: Tejun Heo <tj@kernel.org> Suggested-by: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
![]() |
7fc5854f8c |
writeback: synchronize sync(2) against cgroup writeback membership switches
sync_inodes_sb() can race against cgwb (cgroup writeback) membership switches and fail to writeback some inodes. For example, if an inode switches to another wb while sync_inodes_sb() is in progress, the new wb might not be visible to bdi_split_work_to_wbs() at all or the inode might jump from a wb which hasn't issued writebacks yet to one which already has. This patch adds backing_dev_info->wb_switch_rwsem to synchronize cgwb switch path against sync_inodes_sb() so that sync_inodes_sb() is guaranteed to see all the target wbs and inodes can't jump wbs to escape syncing. v2: Fixed misplaced rwsem init. Spotted by Jiufei. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Jiufei Xue <xuejiufei@gmail.com> Link: http://lkml.kernel.org/r/dc694ae2-f07f-61e1-7097-7c8411cee12d@gmail.com Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
04edf02cdd |
fs: Convert writeback to XArray
A couple of short loops. Signed-off-by: Matthew Wilcox <willy@infradead.org> |
||
![]() |
b8b784958e |
bdi: Fix oops in wb_workfn()
Syzbot has reported that it can hit a NULL pointer dereference in
wb_workfn() due to wb->bdi->dev being NULL. This indicates that
wb_workfn() was called for an already unregistered bdi which should not
happen as wb_shutdown() called from bdi_unregister() should make sure
all pending writeback works are completed before bdi is unregistered.
Except that wb_workfn() itself can requeue the work with:
mod_delayed_work(bdi_wq, &wb->dwork, 0);
and if this happens while wb_shutdown() is waiting in:
flush_delayed_work(&wb->dwork);
the dwork can get executed after wb_shutdown() has finished and
bdi_unregister() has cleared wb->bdi->dev.
Make wb_workfn() use wakeup_wb() for requeueing the work which takes all
the necessary precautions against racing with bdi unregistration.
CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
CC: Tejun Heo <tj@kernel.org>
Fixes:
|
||
![]() |
2e898e4c0a |
writeback: safer lock nesting
lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if
the page's memcg is undergoing move accounting, which occurs when a
process leaves its memcg for a new one that has
memory.move_charge_at_immigrate set.
unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if
the given inode is switching writeback domains. Switches occur when
enough writes are issued from a new domain.
This existing pattern is thus suspicious:
lock_page_memcg(page);
unlocked_inode_to_wb_begin(inode, &locked);
...
unlocked_inode_to_wb_end(inode, locked);
unlock_page_memcg(page);
If both inode switch and process memcg migration are both in-flight then
unlocked_inode_to_wb_end() will unconditionally enable interrupts while
still holding the lock_page_memcg() irq spinlock. This suggests the
possibility of deadlock if an interrupt occurs before unlock_page_memcg().
truncate
__cancel_dirty_page
lock_page_memcg
unlocked_inode_to_wb_begin
unlocked_inode_to_wb_end
<interrupts mistakenly enabled>
<interrupt>
end_page_writeback
test_clear_page_writeback
lock_page_memcg
<deadlock>
unlock_page_memcg
Due to configuration limitations this deadlock is not currently possible
because we don't mix cgroup writeback (a cgroupv2 feature) and
memory.move_charge_at_immigrate (a cgroupv1 feature).
If the kernel is hacked to always claim inode switching and memcg
moving_account, then this script triggers lockup in less than a minute:
cd /mnt/cgroup/memory
mkdir a b
echo 1 > a/memory.move_charge_at_immigrate
echo 1 > b/memory.move_charge_at_immigrate
(
echo $BASHPID > a/cgroup.procs
while true; do
dd if=/dev/zero of=/mnt/big bs=1M count=256
done
) &
while true; do
sync
done &
sleep 1h &
SLEEP=$!
while true; do
echo $SLEEP > a/cgroup.procs
echo $SLEEP > b/cgroup.procs
done
The deadlock does not seem possible, so it's debatable if there's any
reason to modify the kernel. I suggest we should to prevent future
surprises. And Wang Long said "this deadlock occurs three times in our
environment", so there's more reason to apply this, even to stable.
Stable 4.4 has minor conflicts applying this patch. For a clean 4.4 patch
see "[PATCH for-4.4] writeback: safer lock nesting"
https://lkml.org/lkml/2018/4/11/146
Wang Long said "this deadlock occurs three times in our environment"
[gthelen@google.com: v4]
Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com
[akpm@linux-foundation.org: comment tweaks, struct initialization simplification]
Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613
Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com
Fixes:
|
||
![]() |
b93b016313 |
page cache: use xa_lock
Remove the address_space ->tree_lock and use the xa_lock newly added to the radix_tree_root. Rename the address_space ->page_tree to ->i_pages, since we don't really care that it's a tree. [willy@infradead.org: fix nds32, fs/dax.c] Link: http://lkml.kernel.org/r/20180406145415.GB20605@bombadil.infradead.orgLink: http://lkml.kernel.org/r/20180313132639.17387-9-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Jeff Layton <jlayton@redhat.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
![]() |
0e11f6443f |
fs: move I_DIRTY_INODE to fs.h
And use it in a few more places rather than opencoding the values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
![]() |
bbbc3c1cfa |
writeback: update comment in inode_io_list_move_locked
The @head can be wb->b_dirty_time, so update the comment. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Wang Long <wanglong19@meituan.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
1751e8a6cb |
Rename superblock flags (MS_xyz -> SB_xyz)
This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/*: it generally should *not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
![]() |
8264c3214f |
writeback: merge try_to_writeback_inodes_sb_nr() into caller
Since commit
|
||
![]() |
85009b4f5f |
writeback: eliminate work item allocation in bd_start_writeback()
Handle start-all writeback like we do periodic or kupdate style writeback - by marking the bdi_writeback as needing a full flush, and simply waking the thread. This eliminates the need to allocate and queue a specific work item just for this purpose. After this change, we truly only ever have one of them running at any point in time. We mark the need to start all flushes, and the writeback thread will clear it once it has processed the request. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
aac8d41cd4 |
writeback: only allow one inflight and pending full flush
When someone calls wakeup_flusher_threads() or wakeup_flusher_threads_bdi(), they schedule writeback of all dirty pages in the system (or on that bdi). If we are tight on memory, we can get tons of these queued from kswapd/vmscan. This causes (at least) two problems: 1) We consume a ton of memory just allocating writeback work items. We've seen as much as 600 million of these writeback work items pending. That's a lot of memory to pointlessly hold hostage, while the box is under memory pressure. 2) We spend so much time processing these work items, that we introduce a softlockup in writeback processing. This is because each of the writeback work items don't end up doing any work (it's hard when you have millions of identical ones coming in to the flush machinery), so we just sit in a tight loop pulling work items and deleting/freeing them. Fix this by adding a 'start_all' bit to the writeback structure, and set that when someone attempts to flush all dirty pages. The bit is cleared when we start writeback on that work item. If the bit is already set when we attempt to queue !nr_pages writeback, then we simply ignore it. This provides us one full flush in flight, with one pending as well, and makes for more efficient handling of this type of writeback. Acked-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: Chris Mason <clm@fb.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
![]() |
e8e8a0c6c9 |
writeback: move nr_pages == 0 logic to one location
Now that we have no external callers of wb_start_writeback(), we can shuffle the passing in of 'nr_pages'. Everybody passes in 0 at this point, so just kill the argument and move the dirty count retrieval to that function. Acked-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: Chris Mason <clm@fb.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |