android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Greg Kroah-Hartman	d9b90a99f3	Merge 5.10.156 into android12-5.10-lts Changes in 5.10.156 ASoC: wm5102: Revert "ASoC: wm5102: Fix PM disable depth imbalance in wm5102_probe" ASoC: wm5110: Revert "ASoC: wm5110: Fix PM disable depth imbalance in wm5110_probe" ASoC: wm8997: Revert "ASoC: wm8997: Fix PM disable depth imbalance in wm8997_probe" ASoC: mt6660: Keep the pm_runtime enables before component stuff in mt6660_i2c_probe ASoC: wm8962: Add an event handler for TEMP_HP and TEMP_SPK spi: intel: Fix the offset to get the 64K erase opcode ASoC: codecs: jz4725b: add missed Line In power control bit ASoC: codecs: jz4725b: fix reported volume for Master ctl ASoC: codecs: jz4725b: use right control for Capture Volume ASoC: codecs: jz4725b: fix capture selector naming selftests/futex: fix build for clang selftests/intel_pstate: fix build for ARCH=x86_64 ASoC: rt1308-sdw: add the default value of some registers drm/amd/display: Remove wrong pipe control lock NFSv4: Retry LOCK on OLD_STATEID during delegation return i2c: tegra: Allocate DMA memory for DMA engine i2c: i801: add lis3lv02d's I2C address for Vostro 5568 drm/imx: imx-tve: Fix return type of imx_tve_connector_mode_valid btrfs: remove pointless and double ulist frees in error paths of qgroup tests Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm ASoC: codecs: jz4725b: Fix spelling mistake "Sourc" -> "Source", "Routee" -> "Route" ALSA: hda/realtek: fix speakers and micmute on HP 855 G8 mtd: spi-nor: intel-spi: Disable write protection only if asked spi: intel: Use correct mask for flash and protected regions mmc: sdhci-esdhc-imx: use the correct host caps for MMC_CAP_8_BIT_DATA drm/amd/pm: support power source switch on Sienna Cichlid drm/amd/pm: Read BIF STRAP also for BACO check drm/amd/pm: disable BACO entry/exit completely on several sienna cichlid cards drm/amdgpu: disable BACO on special BEIGE_GOBY card spi: stm32: Print summary 'callbacks suppressed' message ASoC: core: Fix use-after-free in snd_soc_exit() ASoC: tas2770: Fix set_tdm_slot in case of single slot ASoC: tas2764: Fix set_tdm_slot in case of single slot serial: 8250: Remove serial_rs485 sanitization from em485 serial: 8250: omap: Fix missing PM runtime calls for omap8250_set_mctrl() serial: 8250_omap: remove wait loop from Errata i202 workaround serial: 8250: omap: Fix unpaired pm_runtime_put_sync() in omap8250_remove() serial: 8250: omap: Flush PM QOS work on remove serial: imx: Add missing .thaw_noirq hook tty: n_gsm: fix sleep-in-atomic-context bug in gsm_control_send bpf, test_run: Fix alignment problem in bpf_prog_test_run_skb() ASoC: soc-utils: Remove __exit for snd_soc_util_exit() sctp: remove the unnecessary sinfo_stream check in sctp_prsctp_prune_unsent sctp: clear out_curr if all frag chunks of current msg are pruned block: sed-opal: kmalloc the cmd/resp buffers arm64: Fix bit-shifting UB in the MIDR_CPU_MODEL() macro siox: fix possible memory leak in siox_device_add() parport_pc: Avoid FIFO port location truncation pinctrl: devicetree: fix null pointer dereferencing in pinctrl_dt_to_map drm/panel: simple: set bpc field for logic technologies displays drm/drv: Fix potential memory leak in drm_dev_init() drm: Fix potential null-ptr-deref in drm_vblank_destroy_worker() ARM: dts: imx7: Fix NAND controller size-cells arm64: dts: imx8mm: Fix NAND controller size-cells arm64: dts: imx8mn: Fix NAND controller size-cells ata: libata-transport: fix double ata_host_put() in ata_tport_add() ata: libata-transport: fix error handling in ata_tport_add() ata: libata-transport: fix error handling in ata_tlink_add() ata: libata-transport: fix error handling in ata_tdev_add() bpf: Initialize same number of free nodes for each pcpu_freelist net: bgmac: Drop free_netdev() from bgmac_enet_remove() mISDN: fix possible memory leak in mISDN_dsp_element_register() net: hinic: Fix error handling in hinic_module_init() net: liquidio: release resources when liquidio driver open failed mISDN: fix misuse of put_device() in mISDN_register_device() net: macvlan: Use built-in RCU list checking net: caif: fix double disconnect client in chnl_net_open() bnxt_en: Remove debugfs when pci_register_driver failed xen/pcpu: fix possible memory leak in register_pcpu() net: ionic: Fix error handling in ionic_init_module() net: ena: Fix error handling in ena_init() drbd: use after free in drbd_create_device() platform/x86/intel: pmc: Don't unconditionally attach Intel PMC when virtualized cifs: add check for returning value of SMB2_close_init net: ag71xx: call phylink_disconnect_phy if ag71xx_hw_enable() fail in ag71xx_open() net/x25: Fix skb leak in x25_lapb_receive_frame() cifs: Fix wrong return value checking when GETFLAGS net: thunderbolt: Fix error handling in tbnet_init() cifs: add check for returning value of SMB2_set_info_init ftrace: Fix the possible incorrect kernel message ftrace: Optimize the allocation for mcount entries ftrace: Fix null pointer dereference in ftrace_add_mod() ring_buffer: Do not deactivate non-existant pages tracing/ring-buffer: Have polling block on watermark tracing: Fix memory leak in test_gen_synth_cmd() and test_empty_synth_event() tracing: Fix wild-memory-access in register_synth_event() tracing: kprobe: Fix potential null-ptr-deref on trace_event_file in kprobe_event_gen_test_exit() tracing: kprobe: Fix potential null-ptr-deref on trace_array in kprobe_event_gen_test_exit() ALSA: usb-audio: Drop snd_BUG_ON() from snd_usbmidi_output_open() ALSA: hda/realtek: fix speakers for Samsung Galaxy Book Pro ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360 Revert "usb: dwc3: disable USB core PHY management" slimbus: stream: correct presence rate frequencies speakup: fix a segfault caused by switching consoles USB: bcma: Make GPIO explicitly optional USB: serial: option: add Sierra Wireless EM9191 USB: serial: option: remove old LARA-R6 PID USB: serial: option: add u-blox LARA-R6 00B modem USB: serial: option: add u-blox LARA-L6 modem USB: serial: option: add Fibocom FM160 0x0111 composition usb: add NO_LPM quirk for Realforce 87U Keyboard usb: chipidea: fix deadlock in ci_otg_del_timer usb: typec: mux: Enter safe mode only when pins need to be reconfigured iio: adc: at91_adc: fix possible memory leak in at91_adc_allocate_trigger() iio: trigger: sysfs: fix possible memory leak in iio_sysfs_trig_init() iio: adc: mp2629: fix wrong comparison of channel iio: adc: mp2629: fix potential array out of bound access iio: pressure: ms5611: changed hardcoded SPI speed to value limited dm ioctl: fix misbehavior if list_versions races with module loading serial: 8250: Fall back to non-DMA Rx if IIR_RDI occurs serial: 8250: Flush DMA Rx on RLSI serial: 8250_lpss: Configure DMA also w/o DMA filter Input: iforce - invert valid length check when fetching device IDs maccess: Fix writing offset in case of fault in strncpy_from_kernel_nofault() scsi: zfcp: Fix double free of FSF request when qdio send fails iommu/vt-d: Set SRE bit only when hardware has SRS cap firmware: coreboot: Register bus in module init mmc: core: properly select voltage range without power cycle mmc: sdhci-pci-o2micro: fix card detect fail issue caused by CD# debounce timeout mmc: sdhci-pci: Fix possible memory leak caused by missing pci_dev_put() docs: update mediator contact information in CoC doc misc/vmw_vmci: fix an infoleak in vmci_host_do_receive_datagram() perf/x86/intel/pt: Fix sampling using single range output nvme: restrict management ioctls to admin nvme: ensure subsystem reset is single threaded net: fix a concurrency bug in l2tp_tunnel_register() ring-buffer: Include dropped pages in counting dirty patches usbnet: smsc95xx: Fix deadlock on runtime resume stddef: Introduce struct_group() helper macro net: use struct_group to copy ip/ipv6 header addresses scsi: target: tcm_loop: Fix possible name leak in tcm_loop_setup_hba_bus() scsi: scsi_debug: Fix possible UAF in sdebug_add_host_helper() kprobes: Skip clearing aggrprobe's post_handler in kprobe-on-ftrace case Input: i8042 - fix leaking of platform device on module removal uapi/linux/stddef.h: Add include guards macvlan: enforce a consistent minimal mtu tcp: cdg: allow tcp_cdg_release() to be called multiple times kcm: avoid potential race in kcm_tx_work kcm: close race conditions on sk_receive_queue 9p: trans_fd/p9_conn_cancel: drop client lock earlier gfs2: Check sb_bsize_shift after reading superblock gfs2: Switch from strlcpy to strscpy 9p/trans_fd: always use O_NONBLOCK read/write mm: fs: initialize fsdata passed to write_begin/write_end interface ntfs: fix use-after-free in ntfs_attr_find() ntfs: fix out-of-bounds read in ntfs_attr_find() ntfs: check overflow when iterating ATTR_RECORDs Revert "net: broadcom: Fix BCMGENET Kconfig" Linux 5.10.156 Change-Id: Ic9fe339913a510cc9fb9c4557b3bd6e196db834f Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>	2022-12-02 08:42:05 +00:00
Alexander Potapenko	294ef12dcc	mm: fs: initialize fsdata passed to write_begin/write_end interface commit 1468c6f4558b1bcd92aa0400f2920f9dc7588402 upstream. Functions implementing the a_ops->write_end() interface accept the `void fsdata` parameter that is supposed to be initialized by the corresponding a_ops->write_begin() (which accepts `void *fsdata`). However not all a_ops->write_begin() implementations initialize `fsdata` unconditionally, so it may get passed uninitialized to a_ops->write_end(), resulting in undefined behavior. Fix this by initializing fsdata with NULL before the call to write_begin(), rather than doing so in all possible a_ops implementations. This patch covers only the following cases found by running x86 KMSAN under syzkaller: - generic_perform_write() - cont_expand_zero() and generic_cont_expand_simple() - page_symlink() Other cases of passing uninitialized fsdata may persist in the codebase. Link: https://lkml.kernel.org/r/20220915150417.722975-43-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-25 17:45:56 +01:00
Greg Kroah-Hartman	91be4236fb	ANDROID: GKI: remove vfs-only namespace from 2 symbols Commit `d483eed85f` ("ANDROID: GKI: set vfs-only exports into their own namespace") moved a bunch of symbols into a vfs-only namespace to make it possible for some external filesystem modules to be able to use them. Unfortunately the following two symbols were already being marked used by external modules, and moving them into a different namespace broke existing users of these symbols: kern_path __sync_dirty_buffer The ABI checking tools do not take the namespace of the symbol into consideration when checking, as that is a Linux kernel "add-on" and not part of the kernel symbol table information directly, which is why this was not caught earlier. Bug: 157965270 Bug: 210074446 Bug: 216253405 Bug: 219830266 Fixes: `d483eed85f` ("ANDROID: GKI: set vfs-only exports into their own namespace") Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I4a791edb33312da232cb088613bd4eb8f5548239	2022-03-08 07:19:02 +00:00
Greg Kroah-Hartman	d483eed85f	ANDROID: GKI: set vfs-only exports into their own namespace We have namespaces, so use them for all vfs-exported namespaces so that filesystems can use them, but not anything else. Some in-kernel drivers that do direct filesystem accesses (because they serve up files) are also allowed access to these symbols to keep 'make allmodconfig' builds working properly, but it is not needed for Android kernel images. Bug: 157965270 Bug: 210074446 Cc: Matthias Maennich <maennich@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Iaf6140baf3a18a516ab2d5c3966235c42f3f70de	2022-01-11 09:30:47 +01:00
Minchan Kim	a9ac6ae90e	BACKPORT: UPSTREAM: mm: fs: invalidate bh_lrus for only cold path kernel test robot reported the regression of fio.write_iops[1] with [2]. Since lru_add_drain is called frequently, invalidate bh_lrus there could increase bh_lrus cache miss ratio, which needs more IO in the end. This patch moves the bh_lrus invalidation from the hot path( e.g., zap_page_range, pagevec_release) to cold path(i.e., lru_add_drain_all, lru_cache_disable). "Xing, Zhengjun" confirmed : I test the patch, the regression reduced to -2.9%. [1] https://lore.kernel.org/lkml/20210520083144.GD14190@xsang-OptiPlex-9020/ [2] 8cc621d2f45d, mm: fs: invalidate BH LRU during page migration Bug: 194673488 Link: https://lkml.kernel.org/r/20210907212347.1977686-1-minchan@kernel.org (cherry picked from commit 243418e3925d5b5b0657ae54c322d43035e97eed) [Chris: resolved conflicts due to Minchan's AOSP LRU commits] Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Tested-by: "Xing, Zhengjun" <zhengjun.xing@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com> Change-Id: Icc5e456b058df516480b4378853464d6d7b43505	2021-09-27 17:48:37 -07:00
Chris Goldsworthy	434df9f35d	Revert "FROMLIST: fs/buffer.c: Revoke LRU when trying to drop buffers" This reverts commit `7d212a5102`. This commit is superseded by commit `a0a0b3f42e` ("FROMLIST: mm: fs: Invalidate BH LRU during page migration"). Conflicts: fs/buffer.c 1. In fs/buffer.c had to keep invalidate_bh_lrus_cpu() function that was introduced in commit `a0a0b3f42e` as well as in the reverted commit. Bug: 174118021 Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I60d0a4e8beb35389727f29fea6ca4640ecee40a7	2021-03-24 17:40:49 +00:00
Minchan Kim	a0a0b3f42e	FROMLIST: mm: fs: Invalidate BH LRU during page migration Pages containing buffer_heads that are in one of the per-CPU buffer_head LRU caches will be pinned and thus cannot be migrated. This can prevent CMA allocations from succeeding, which are often used on platforms with co-processors (such as a DSP) that can only use physically contiguous memory. It can also prevent memory hot-unplugging from succeeding, which involves migrating at least MIN_MEMORY_BLOCK_SIZE bytes of memory, which ranges from 8 MiB to 1 GiB based on the architecture in use. Correspondingly, invalidate the BH LRU caches before a migration starts and stop any buffer_head from being cached in the LRU caches, until migration has finished. Bug: 180018981 Link: https://lore.kernel.org/linux-mm/20210319175127.886124-3-minchan@kernel.org/ Tested-by: Oliver Sang <oliver.sang@intel.com> Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Idb8279cb561812f5f1b43ddbb742c1808700754e	2021-03-23 04:05:31 +00:00
Minchan Kim	9e2790466f	Revert "FROMLIST: mm: fs: Invalidate BH LRU during page migration" This reverts commit `169ddec367`. Bug: 180018981 Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: Ib3b370027ebf377453266761edb0a147d8cb0959	2021-03-23 04:04:42 +00:00
Minchan Kim	169ddec367	FROMLIST: mm: fs: Invalidate BH LRU during page migration Pages containing buffer_heads that are in one of the per-CPU buffer_head LRU caches will be pinned and thus cannot be migrated. This can prevent CMA allocations from succeeding, which are often used on platforms with co-processors (such as a DSP) that can only use physically contiguous memory. It can also prevent memory hot-unplugging from succeeding, which involves migrating at least MIN_MEMORY_BLOCK_SIZE bytes of memory, which ranges from 8 MiB to 1 GiB based on the architecture in use. Correspondingly, invalidate the BH LRU caches before a migration starts and stop any buffer_head from being cached in the LRU caches, until migration has finished. Bug: 180018981 Link: https://lore.kernel.org/linux-mm/20210310161429.399432-3-minchan@kernel.org/ Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Minchan Kim <minchan@google.com> Change-Id: I7ac085c2ec14a81c3c4d7b65a7eeedb0cfba4ea6	2021-03-12 12:35:53 -08:00
Jaegeuk Kim	d92620d79b	Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.10.y' into android12-5.10 * aosp/upstream-f2fs-stable-linux-5.10.y: fs-verity: support reading signature with ioctl fs-verity: support reading descriptor with ioctl fs-verity: support reading Merkle tree with ioctl fs-verity: add FS_IOC_READ_VERITY_METADATA ioctl fs-verity: don't pass whole descriptor to fsverity_verify_signature() fs-verity: factor out fsverity_get_descriptor() fs: simplify freeze_bdev/thaw_bdev f2fs: remove FAULT_ALLOC_BIO f2fs: use blkdev_issue_flush in __submit_flush_wait f2fs: remove a few bd_part checks Documentation: f2fs: fix typo s/automaic/automatic f2fs: give a warning only for readonly partition f2fs: don't grab superblock freeze for flush/ckpt thread f2fs: add ckpt_thread_ioprio sysfs node f2fs: introduce checkpoint_merge mount option f2fs: relocate inline conversion from mmap() to mkwrite() f2fs: fix a wrong condition in __submit_bio f2fs: remove unnecessary initialization in xattr.c f2fs: fix to avoid inconsistent quota data f2fs: flush data when enabling checkpoint back f2fs: deprecate f2fs_trace_io f2fs: Remove readahead collision detection f2fs: remove unused stat_{inc, dec}_atomic_write f2fs: introduce sb_status sysfs node f2fs: fix to use per-inode maxbytes f2fs: compress: fix potential deadlock libfs: unexport generic_ci_d_compare() and generic_ci_d_hash() f2fs: fix to set/clear I_LINKABLE under i_lock f2fs: fix null page reference in redirty_blocks f2fs: clean up post-read processing f2fs: trival cleanup in move_data_block() f2fs: fix out-of-repair __setattr_copy() f2fs: fix to tag FIEMAP_EXTENT_MERGED in f2fs_fiemap() f2fs: introduce a new per-sb directory in sysfs f2fs: compress: support compress level f2fs: compress: deny setting unsupported compress algorithm f2fs: relocate f2fs_precache_extents() f2fs: enforce the immutable flag on open files f2fs: enhance to update i_mode and acl atomically in f2fs_setattr() f2fs: fix to set inode->i_mode correctly for posix_acl_update_mode f2fs: Replace expression with offsetof() f2fs: handle unallocated section and zone on pinned/atgc Bug: 178226640 Signed-off-by: Jaegeuk Kim <jaegeuk@google.com> Change-Id: I95112779a0a75f3cdbc222126a198d54f1e378ac	2021-03-01 19:06:56 -08:00
Christoph Hellwig	d96cd09aed	fs: simplify freeze_bdev/thaw_bdev Store the frozen superblock in struct block_device to avoid the awkward interface that can return a sb only used a cookie, an ERR_PTR or NULL. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Chao Yu <yuchao0@huawei.com> [f2fs] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-02-22 09:32:55 -08:00
Laura Abbott	7d212a5102	FROMLIST: fs/buffer.c: Revoke LRU when trying to drop buffers When a buffer is added to the LRU list, a reference is taken which is not dropped until the buffer is evicted from the LRU list. This is the correct behavior, however this LRU reference will prevent the buffer from being dropped. This means that the buffer can't actually be dropped until it is selected for eviction. There's no bound on the time spent on the LRU list, which means that the buffer may be undroppable for very long periods of time. Given that migration involves dropping buffers, the associated page is now unmigratible for long periods of time as well. CMA relies on being able to migrate a specific range of pages, so these types of failures make CMA significantly less reliable, especially under high filesystem usage. Rather than waiting for the LRU algorithm to eventually kick out the buffer, explicitly remove the buffer from the LRU list when trying to drop it. There is still the possibility that the buffer could be added back on the list, but that indicates the buffer is still in use and would probably have other 'in use' indicates to prevent dropping. Note: a bug reported by "kernel test robot" lead to a switch from using xas_for_each() to xa_for_each(). Bug: 174118021 Link: https://lore.kernel.org/linux-fsdevel/cover.1611642038.git.cgoldswo@codeaurora.org/ Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Cc: Matthew Wilcox <willy@infradead.org> Reported-by: kernel test robot <oliver.sang@intel.com> Change-Id: I561fa3ac7e8874e27d4ad8e1d62ab62e18dd419c	2021-01-28 09:49:23 +00:00
Alistair Delva	fefe6520ad	Revert "FROMLIST: fs/buffer.c: Revoke LRU when trying to drop buffers" This reverts commit `43edfc892e`. Reason for revert: broke various vts tests Bug: 174118021 Change-Id: I5b0c802c5da476a78d7a3e125043156e3eb4b58d Signed-off-by: Alistair Delva <adelva@google.com>	2021-01-21 23:48:39 +00:00
Laura Abbott	43edfc892e	FROMLIST: fs/buffer.c: Revoke LRU when trying to drop buffers When a buffer is added to the LRU list, a reference is taken which is not dropped until the buffer is evicted from the LRU list. This is the correct behavior, however this LRU reference will prevent the buffer from being dropped. This means that the buffer can't actually be dropped until it is selected for eviction. There's no bound on the time spent on the LRU list, which means that the buffer may be undroppable for very long periods of time. Given that migration involves dropping buffers, the associated page is now unmigratible for long periods of time as well. CMA relies on being able to migrate a specific range of pages, so these types of failures make CMA significantly less reliable, especially under high filesystem usage. Rather than waiting for the LRU algorithm to eventually kick out the buffer, explicitly remove the buffer from the LRU list when trying to drop it. There is still the possibility that the buffer could be added back on the list, but that indicates the buffer is still in use and would probably have other 'in use' indicates to prevent dropping. Note: a bug reported by "kernel test robot" lead to a switch from using xas_for_each() to xa_for_each(). Bug: 174118021 Link: https://lore.kernel.org/linux-mm/cover.1610572007.git.cgoldswo@codeaurora.org/ Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Cc: Matthew Wilcox <willy@infradead.org> Reported-by: kernel test robot <oliver.sang@intel.com> Change-Id: I4a93c4ed81c57874764d12f3beea1194a30c13b2	2021-01-14 03:04:05 +00:00
Roman Gushchin	b87d8cefe4	mm, memcg: rework remote charging API to support nesting Currently the remote memcg charging API consists of two functions: memalloc_use_memcg() and memalloc_unuse_memcg(), which set and clear the memcg value, which overwrites the memcg of the current task. memalloc_use_memcg(target_memcg); <...> memalloc_unuse_memcg(); It works perfectly for allocations performed from a normal context, however an attempt to call it from an interrupt context or just nest two remote charging blocks will lead to an incorrect accounting. On exit from the inner block the active memcg will be cleared instead of being restored. memalloc_use_memcg(target_memcg); memalloc_use_memcg(target_memcg_2); <...> memalloc_unuse_memcg(); Error: allocation here are charged to the memcg of the current process instead of target_memcg. memalloc_unuse_memcg(); This patch extends the remote charging API by switching to a single function: struct mem_cgroup set_active_memcg(struct mem_cgroup memcg), which sets the new value and returns the old one. So a remote charging block will look like: old_memcg = set_active_memcg(target_memcg); <...> set_active_memcg(old_memcg); This patch is heavily based on the patch by Johannes Weiner, which can be found here: https://lkml.org/lkml/2020/5/28/806 . Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Schatzberg <dschatzberg@fb.com> Link: https://lkml.kernel.org/r/20200821212056.3769116-1-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-10-18 09:27:09 -07:00
Jan Kara	6dbf7bb555	fs: Don't invalidate page buffers in block_write_full_page() If block_write_full_page() is called for a page that is beyond current inode size, it will truncate page buffers for the page and return 0. This logic has been added in 2.5.62 in commit 81eb69062588 ("fix ext3 BUG due to race with truncate") in history.git tree to fix a problem with ext3 in data=ordered mode. This particular problem doesn't exist anymore because ext3 is long gone and ext4 handles ordered data differently. Also normally buffers are invalidated by truncate code and there's no need to specially handle this in ->writepage() code. This invalidation of page buffers in block_write_full_page() is causing issues to filesystems (e.g. ext4 or ocfs2) when block device is shrunk under filesystem's hands and metadata buffers get discarded while being tracked by the journalling layer. Although it is obviously "not supported" it can cause kernel crashes like: [ 7986.689400] BUG: unable to handle kernel NULL pointer dereference at +0000000000000008 [ 7986.697197] PGD 0 P4D 0 [ 7986.699724] Oops: 0002 [#1] SMP PTI [ 7986.703200] CPU: 4 PID: 203778 Comm: jbd2/dm-3-8 Kdump: loaded Tainted: G +O --------- - - 4.18.0-147.5.0.5.h126.eulerosv2r9.x86_64 #1 [ 7986.716438] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 1.57 08/11/2015 [ 7986.723462] RIP: 0010:jbd2_journal_grab_journal_head+0x1b/0x40 [jbd2] ... [ 7986.810150] Call Trace: [ 7986.812595] __jbd2_journal_insert_checkpoint+0x23/0x70 [jbd2] [ 7986.818408] jbd2_journal_commit_transaction+0x155f/0x1b60 [jbd2] [ 7986.836467] kjournald2+0xbd/0x270 [jbd2] which is not great. The crash happens because bh->b_private is suddently NULL although BH_JBD flag is still set (this is because block_invalidatepage() cleared BH_Mapped flag and subsequent bh lookup found buffer without BH_Mapped set, called init_page_buffers() which has rewritten bh->b_private). So just remove the invalidation in block_write_full_page(). Note that the buffer cache invalidation when block device changes size is already careful to avoid similar problems by using invalidate_mapping_pages() which skips busy buffers so it was only this odd block_write_full_page() behavior that could tear down bdev buffers under filesystem's hands. Reported-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> CC: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-09-07 10:23:07 -06:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Linus Torvalds	d723b99ec9	Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Improvements to ext4's block allocator performance for very large file systems, especially when the file system or files which are highly fragmented. There is a new mount option, prefetch_block_bitmaps which will pull in the block bitmaps and set up the in-memory buddy bitmaps when the file system is initially mounted. Beyond that, a lot of bug fixes and cleanups. In particular, a number of changes to make ext4 more robust in the face of write errors or file system corruptions" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (46 commits) ext4: limit the length of per-inode prealloc list ext4: reorganize if statement of ext4_mb_release_context() ext4: add mb_debug logging when there are lost chunks ext4: Fix comment typo "the the". jbd2: clean up checksum verification in do_one_pass() ext4: change to use fallthrough macro ext4: remove unused parameter of ext4_generic_delete_entry function mballoc: replace seq_printf with seq_puts ext4: optimize the implementation of ext4_mb_good_group() ext4: delete invalid comments near ext4_mb_check_limits() ext4: fix typos in ext4_mb_regular_allocator() comment ext4: fix checking of directory entry validity for inline directories fs: prevent BUG_ON in submit_bh_wbc() ext4: correctly restore system zone info when remount fails ext4: handle add_system_zone() failure in ext4_setup_system_zone() ext4: fold ext4_data_block_valid_rcu() into the caller ext4: check journal inode extents more carefully ext4: don't allow overlapping system zones ext4: handle error of ext4_setup_system_zone() on remount ext4: delete the invalid BUGON in ext4_mb_load_buddy_gfp() ...	2020-08-21 11:03:38 -07:00
Xianting Tian	377254b2cd	fs: prevent BUG_ON in submit_bh_wbc() If a device is hot-removed --- for example, when a physical device is unplugged from pcie slot or a nbd device's network is shutdown --- this can result in a BUG_ON() crash in submit_bh_wbc(). This is because the when the block device dies, the buffer heads will have their Buffer_Mapped flag get cleared, leading to the crash in submit_bh_wbc. We had attempted to work around this problem in commit `a17712c8` ("ext4: check superblock mapped prior to committing"). Unfortunately, it's still possible to hit the BUG_ON(!buffer_mapped(bh)) if the device dies between when the work-around check in ext4_commit_super() and when submit_bh_wbh() is finally called: Code path: ext4_commit_super judge if 'buffer_mapped(sbh)' is false, return <== commit `a17712c8` lock_buffer(sbh) ... unlock_buffer(sbh) __sync_dirty_buffer(sbh,... lock_buffer(sbh) judge if 'buffer_mapped(sbh))' is false, return <== added by this patch submit_bh(...,sbh) submit_bh_wbc(...,sbh,...) [100722.966497] kernel BUG at fs/buffer.c:3095! <== BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc() [100722.966503] invalid opcode: 0000 [#1] SMP [100722.966566] task: ffff8817e15a9e40 task.stack: ffffc90024744000 [100722.966574] RIP: 0010:submit_bh_wbc+0x180/0x190 [100722.966575] RSP: 0018:ffffc90024747a90 EFLAGS: 00010246 [100722.966576] RAX: 0000000000620005 RBX: ffff8818a80603a8 RCX: 0000000000000000 [100722.966576] RDX: ffff8818a80603a8 RSI: 0000000000020800 RDI: 0000000000000001 [100722.966577] RBP: ffffc90024747ac0 R08: 0000000000000000 R09: ffff88207f94170d [100722.966578] R10: 00000000000437c8 R11: 0000000000000001 R12: 0000000000020800 [100722.966578] R13: 0000000000000001 R14: 000000000bf9a438 R15: ffff88195f333000 [100722.966580] FS: 00007fa2eee27700(0000) GS:ffff88203d840000(0000) knlGS:0000000000000000 [100722.966580] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [100722.966581] CR2: 0000000000f0b008 CR3: 000000201a622003 CR4: 00000000007606e0 [100722.966582] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [100722.966583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [100722.966583] PKRU: 55555554 [100722.966583] Call Trace: [100722.966588] __sync_dirty_buffer+0x6e/0xd0 [100722.966614] ext4_commit_super+0x1d8/0x290 [ext4] [100722.966626] __ext4_std_error+0x78/0x100 [ext4] [100722.966635] ? __ext4_journal_get_write_access+0xca/0x120 [ext4] [100722.966646] ext4_reserve_inode_write+0x58/0xb0 [ext4] [100722.966655] ? ext4_dirty_inode+0x48/0x70 [ext4] [100722.966663] ext4_mark_inode_dirty+0x53/0x1e0 [ext4] [100722.966671] ? __ext4_journal_start_sb+0x6d/0xf0 [ext4] [100722.966679] ext4_dirty_inode+0x48/0x70 [ext4] [100722.966682] __mark_inode_dirty+0x17f/0x350 [100722.966686] generic_update_time+0x87/0xd0 [100722.966687] touch_atime+0xa9/0xd0 [100722.966690] generic_file_read_iter+0xa09/0xcd0 [100722.966694] ? page_cache_tree_insert+0xb0/0xb0 [100722.966704] ext4_file_read_iter+0x4a/0x100 [ext4] [100722.966707] ? __inode_security_revalidate+0x4f/0x60 [100722.966709] __vfs_read+0xec/0x160 [100722.966711] vfs_read+0x8c/0x130 [100722.966712] SyS_pread64+0x87/0xb0 [100722.966716] do_syscall_64+0x67/0x1b0 [100722.966719] entry_SYSCALL64_slow_path+0x25/0x25 To address this, add the check of 'buffer_mapped(bh)' to __sync_dirty_buffer(). This also has the benefit of fixing this for other file systems. With this addition, we can drop the workaround in ext4_commit_supper(). [ Commit description rewritten by tytso. ] Signed-off-by: Xianting Tian <xianting_tian@126.com> Link: https://lore.kernel.org/r/1596211825-8750-1-git-send-email-xianting_tian@126.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-08-07 15:44:59 -04:00
Linus Torvalds	382625d0d4	Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block Pull core block updates from Jens Axboe: "Good amount of cleanups and tech debt removals in here, and as a result, the diffstat shows a nice net reduction in code. - Softirq completion cleanups (Christoph) - Stop using ->queuedata (Christoph) - Cleanup bd claiming (Christoph) - Use check_events, moving away from the legacy media change (Christoph) - Use inode i_blkbits consistently (Christoph) - Remove old unused writeback congestion bits (Christoph) - Cleanup/unify submission path (Christoph) - Use bio_uninit consistently, instead of bio_disassociate_blkg (Christoph) - sbitmap cleared bits handling (John) - Request merging blktrace event addition (Jan) - sysfs add/remove race fixes (Luis) - blk-mq tag fixes/optimizations (Ming) - Duplicate words in comments (Randy) - Flush deferral cleanup (Yufen) - IO context locking/retry fixes (John) - struct_size() usage (Gustavo) - blk-iocost fixes (Chengming) - blk-cgroup IO stats fixes (Boris) - Various little fixes" * tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits) block: blk-timeout: delete duplicated word block: blk-mq-sched: delete duplicated word block: blk-mq: delete duplicated word block: genhd: delete duplicated words block: elevator: delete duplicated word and fix typos block: bio: delete duplicated words block: bfq-iosched: fix duplicated word iocost_monitor: start from the oldest usage index iocost: Fix check condition of iocg abs_vdebt block: Remove callback typedefs for blk_mq_ops block: Use non _rcu version of list functions for tag_set_list blk-cgroup: show global disk stats in root cgroup io.stat blk-cgroup: make iostat functions visible to stat printing block: improve discard bio alignment in __blkdev_issue_discard() block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers block: defer flush request no matter whether we have elevator block: make blk_timeout_init() static block: remove retry loop in ioc_release_fn() block: remove unnecessary ioc nested locking block: integrate bd_start_claiming into __blkdev_get ...	2020-08-03 11:57:03 -07:00
Eric Biggers	4f74d15fe4	ext4: add inline encryption support Wire up ext4 to support inline encryption via the helper functions which fs/crypto/ now provides. This includes: - Adding a mount option 'inlinecrypt' which enables inline encryption on encrypted files where it can be used. - Setting the bio_crypt_ctx on bios that will be submitted to an inline-encrypted file. Note: submit_bh_wbc() in fs/buffer.c also needed to be patched for this part, since ext4 sometimes uses ll_rw_block() on file data. - Not adding logically discontiguous data to bios that will be submitted to an inline-encrypted file. - Not doing filesystem-layer crypto on inline-encrypted files. Co-developed-by: Satya Tangirala <satyat@google.com> Signed-off-by: Satya Tangirala <satyat@google.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20200702015607.1215430-5-satyat@google.com Signed-off-by: Eric Biggers <ebiggers@google.com>	2020-07-08 10:29:43 -07:00
Christoph Hellwig	ed9b3196d2	fs: remove a weird comment in submit_bh_wbc All bios can get remapped if submitted to partitions. No need to comment on that. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-07-01 07:27:23 -06:00
Guoqing Jiang	45dcfc2732	fs/buffer.c: use attach/detach_page_private Since the new pair function is introduced, we can call them to clean the code in buffer.c. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/20200517214718.468-5-guoqing.jiang@cloud.ionos.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-02 10:59:07 -07:00
Jeff Layton	485e9605c0	fs/buffer.c: record blockdev write errors in super_block that it backs When syncing out a block device (a'la __sync_blockdev), any error encountered will only be recorded in the bd_inode's mapping. When the blockdev contains a filesystem however, we'd like to also record the error in the super_block that's stored there. Make mark_buffer_write_io_error also record the error in the corresponding super_block when a writeback error occurs and the block device contains a mounted superblock. Since superblocks are RCU freed, hold the rcu_read_lock to ensure that the superblock doesn't go away while we're marking it. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andres Freund <andres@anarazel.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Link: http://lkml.kernel.org/r/20200428135155.19223-3-jlayton@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-02 10:59:05 -07:00
Linus Torvalds	3d29cb17ba	Merge tag 'block-5.7-2020-04-24' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A few fixes/changes that should go into this release: - null_blk zoned fixes (Damien) - blkdev_close() sync improvement (Douglas) - Fix regression in blk-iocost that impacted (at least) systemtap (Waiman) - Comment fix, header removal (Zhiqiang, Jianpeng)" * tag 'block-5.7-2020-04-24' of git://git.kernel.dk/linux-block: null_blk: Cleanup zoned device initialization null_blk: Fix zoned command handling block: remove unused header blk-iocost: Fix error on iocost_ioc_vrate_adj bdev: Reduce time holding bd_mutex in sync in blkdev_close() buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.	2020-04-24 12:44:19 -07:00
Zhiqiang Liu	c4b4c2a78a	buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason. free_more_memory func has been completely removed in commit `bc48f001de` ("buffer: eliminate the need to call free_more_memory() in __getblk_slow()") So comment and `WB_REASON_FREE_MORE_MEM` reason about free_more_memory are no longer needed. Fixes: `bc48f001de` ("buffer: eliminate the need to call free_more_memory() in __getblk_slow()") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-04-17 21:38:11 -06:00
Roman Gushchin	d87f639258	ext4: use non-movable memory for superblock readahead Since commit `a8ac900b81` ("ext4: use non-movable memory for the superblock") buffers for ext4 superblock were allocated using the sb_bread_unmovable() helper which allocated buffer heads out of non-movable memory blocks. It was necessarily to not block page migrations and do not cause cma allocation failures. However commit `85c8f176a6` ("ext4: preload block group descriptors") broke this by introducing pre-reading of the ext4 superblock. The problem is that __breadahead() is using __getblk() underneath, which allocates buffer heads out of movable memory. It resulted in page migration failures I've seen on a machine with an ext4 partition and a preallocated cma area. Fix this by introducing sb_breadahead_unmovable() and __breadahead_gfp() helpers which use non-movable memory for buffer head allocations and use them for the ext4 superblock readahead. Reviewed-by: Andreas Dilger <adilger@dilger.ca> Fixes: `85c8f176a6` ("ext4: preload block group descriptors") Signed-off-by: Roman Gushchin <guro@fb.com> Link: https://lore.kernel.org/r/20200229001411.128010-1-guro@fb.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2020-04-15 23:58:48 -04:00
Linus Torvalds	4b9fd8a829	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Continued user-access cleanups in the futex code. - percpu-rwsem rewrite that uses its own waitqueue and atomic_t instead of an embedded rwsem. This addresses a couple of weaknesses, but the primary motivation was complications on the -rt kernel. - Introduce raw lock nesting detection on lockdep (CONFIG_PROVE_RAW_LOCK_NESTING=y), document the raw_lock vs. normal lock differences. This too originates from -rt. - Reuse lockdep zapped chain_hlocks entries, to conserve RAM footprint on distro-ish kernels running into the "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!" depletion of the lockdep chain-entries pool. - Misc cleanups, smaller fixes and enhancements - see the changelog for details" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t Documentation/locking/locktypes: Minor copy editor fixes Documentation/locking/locktypes: Further clarifications and wordsmithing m68knommu: Remove mm.h include from uaccess_no.h x86: get rid of user_atomic_cmpxchg_inatomic() generic arch_futex_atomic_op_inuser() doesn't need access_ok() x86: don't reload after cmpxchg in unsafe_atomic_op2() loop x86: convert arch_futex_atomic_op_inuser() to user_access_begin/user_access_end() objtool: whitelist __sanitizer_cov_trace_switch() [parisc, s390, sparc64] no need for access_ok() in futex handling sh: no need of access_ok() in arch_futex_atomic_op_inuser() futex: arch_futex_atomic_op_inuser() calling conventions change completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all() lockdep: Add posixtimer context tracing bits lockdep: Annotate irq_work lockdep: Add hrtimer context tracing bits lockdep: Introduce wait-type checks completion: Use simple wait queues sched/swait: Prepare usage in completions ...	2020-03-30 16:17:15 -07:00
Thomas Gleixner	f1e67e355c	fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t Bit spinlocks are problematic if PREEMPT_RT is enabled, because they disable preemption, which is undesired for latency reasons and breaks when regular spinlocks are taken within the bit_spinlock locked region because regular spinlocks are converted to 'sleeping spinlocks' on RT. PREEMPT_RT replaced the bit spinlocks with regular spinlocks to avoid this problem. The replacement was done conditionaly at compile time, but Christoph requested to do an unconditional conversion. Jan suggested to move the spinlock into a existing padding hole which avoids a size increase of struct buffer_head on production kernels. As a benefit the lock gains lockdep coverage. [ bigeasy: Remove the wrapper and use always spinlock_t and move it into the padding hole ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Link: https://lkml.kernel.org/r/20191118132824.rclhrbujqh4b4g4d@linutronix.de	2020-03-28 13:21:08 +01:00
Christoph Hellwig	29125ed624	block: move guard_bio_eod to bio.c This is bio layer functionality and not related to buffer heads. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-03-25 09:50:08 -06:00
Sebastian Andrzej Siewior	cb923159bb	smp: Remove allocation mask from on_each_cpu_cond.*() The allocation mask is no longer used by on_each_cpu_cond() and on_each_cpu_cond_mask() and can be removed. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20200117090137.1205765-4-bigeasy@linutronix.de	2020-01-24 20:40:09 +01:00
Ming Lei	83c9c54716	fs: move guard_bio_eod() after bio_set_op_attrs Commit `85a8ce62c2` ("block: add bio_truncate to fix guard_bio_eod") adds bio_truncate() for handling bio EOD. However, bio_truncate() doesn't use the passed 'op' parameter from guard_bio_eod's callers. So bio_trunacate() may retrieve wrong 'op', and zering pages may not be done for READ bio. Fixes this issue by moving guard_bio_eod() after bio_set_op_attrs() in submit_bh_wbc() so that bio_truncate() can always retrieve correct op info. Meantime remove the 'op' parameter from guard_bio_eod() because it isn't used any more. Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: linux-fsdevel@vger.kernel.org Fixes: `85a8ce62c2` ("block: add bio_truncate to fix guard_bio_eod") Signed-off-by: Ming Lei <ming.lei@redhat.com> Fold in kerneldoc and bio_op() change. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-01-09 08:16:12 -07:00
Ming Lei	85a8ce62c2	block: add bio_truncate to fix guard_bio_eod Some filesystem, such as vfat, may send bio which crosses device boundary, and the worse thing is that the IO request starting within device boundaries can contain more than one segment past EOD. Commit `dce30ca9e3` ("fs: fix guard_bio_eod to check for real EOD errors") tries to fix this issue by returning -EIO for this situation. However, this way lets fs user code lose chance to handle -EIO, then sync_inodes_sb() may hang for ever. Also the current truncating on last segment is dangerous by updating the last bvec, given bvec table becomes not immutable any more, and fs bio users may not retrieve the truncated pages via bio_for_each_segment_all() in its .end_io callback. Fixes this issue by supporting multi-segment truncating. And the approach is simpler: - just update bio size since block layer can make correct bvec with the updated bio size. Then bvec table becomes really immutable. - zero all truncated segments for read bio Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: linux-fsdevel@vger.kernel.org Fixed-by: `dce30ca9e3` ("fs: fix guard_bio_eod to check for real EOD errors") Reported-by: syzbot+2b9e54155c8c25d8d165@syzkaller.appspotmail.com Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-12-28 09:44:56 -07:00
Ben Dooks	2b211dc04c	fs/buffer.c: include internal.h for missing declarations The declarations of __block_write_begin_int and guard_bio_eod are needed from internal.h so include it to fix the following sparse warnings: fs/buffer.c:1930:5: warning: symbol '__block_write_begin_int' was not declared. Should it be static? fs/buffer.c:2994:6: warning: symbol 'guard_bio_eod' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20191011170039.16100-1-ben.dooks@codethink.co.uk Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-12-01 06:29:17 -08:00
Saurav Girepunje	1d70667973	fs/buffer.c: fix use true/false for bool type Use true/false for bool return type of has_bh_in_lru(). Link: http://lkml.kernel.org/r/20191029040529.GA7625@saurav Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-12-01 06:29:17 -08:00
Eric Biggers	31fb992ce6	fs/buffer.c: support fscrypt in block_read_full_page() After each filesystem block (as represented by a buffer_head) has been read from disk by block_read_full_page(), decrypt it if needed. The decryption is done on the fscrypt_read_workqueue. This is the final change needed to support ext4 encryption with blocksize != PAGE_SIZE, and it's a fairly small change now that CONFIG_FS_ENCRYPTION is a bool and fs/crypto/ exposes functions to decrypt individual blocks and to enqueue work on the fscrypt workqueue. Don't try to add fs-verity support yet, as the fs/verity/ support layer isn't ready for sub-page blocks yet. Just add fscrypt support for now. Almost all the new code is compiled away when CONFIG_FS_ENCRYPTION=n. Cc: Chandan Rajendra <chandan@linux.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20191023033312.361355-2-ebiggers@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2019-11-14 16:40:45 -05:00
Linus Torvalds	9637d51734	Merge tag 'for-linus-20190715' of git://git.kernel.dk/linux-block Pull more block updates from Jens Axboe: "A later pull request with some followup items. I had some vacation coming up to the merge window, so certain things items were delayed a bit. This pull request also contains fixes that came in within the last few days of the merge window, which I didn't want to push right before sending you a pull request. This contains: - NVMe pull request, mostly fixes, but also a few minor items on the feature side that were timing constrained (Christoph et al) - Report zones fixes (Damien) - Removal of dead code (Damien) - Turn on cgroup psi memstall (Josef) - block cgroup MAINTAINERS entry (Konstantin) - Flush init fix (Josef) - blk-throttle low iops timing fix (Konstantin) - nbd resize fixes (Mike) - nbd 0 blocksize crash fix (Xiubo) - block integrity error leak fix (Wenwen) - blk-cgroup writeback and priority inheritance fixes (Tejun)" * tag 'for-linus-20190715' of git://git.kernel.dk/linux-block: (42 commits) MAINTAINERS: add entry for block io cgroup null_blk: fixup ->report_zones() for !CONFIG_BLK_DEV_ZONED block: Limit zone array allocation size sd_zbc: Fix report zones buffer allocation block: Kill gfp_t argument of blkdev_report_zones() block: Allow mapping of vmalloc-ed buffers block/bio-integrity: fix a memory leak bug nvme: fix NULL deref for fabrics options nbd: add netlink reconfigure resize support nbd: fix crash when the blksize is zero block: Disable write plugging for zoned block devices block: Fix elevator name declaration block: Remove unused definitions nvme: fix regression upon hot device removal and insertion blk-throttle: fix zero wait time for iops throttled group block: Fix potential overflow in blk_report_zones() blkcg: implement REQ_CGROUP_PUNT blkcg, writeback: Implement wbc_blkcg_css() blkcg, writeback: Add wbc->no_cgroup_owner blkcg, writeback: Rename wbc_account_io() to wbc_account_cgroup_owner() ...	2019-07-15 21:20:52 -07:00
Tejun Heo	34e51a5e1a	blkcg, writeback: Rename wbc_account_io() to wbc_account_cgroup_owner() wbc_account_io() does a very specific job - try to see which cgroup is actually dirtying an inode and transfer its ownership to the majority dirtier if needed. The name is too generic and confusing. Let's rename it to something more specific. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-07-10 09:00:57 -06:00
Christoph Hellwig	8af54f291e	fs: fold __generic_write_end back into generic_write_end This effectively reverts `a6d639da63` ("fs: factor out a __generic_write_end helper") as we now open code what is left of that helper in iomap. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2019-06-27 17:28:40 -07:00
Thomas Gleixner	457c899653	treewide: Add SPDX license identifier for missed files Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-21 10:50:45 +02:00
Andreas Gruenbacher	7a77dad7e3	iomap: Fix use-after-free error in page_done callback In iomap_write_end, we're not holding a page reference anymore when calling the page_done callback, but the callback needs that reference to access the page. To fix that, move the put_page call in __generic_write_end into the callers of __generic_write_end. Then, in iomap_write_end, put the page after calling the page_done callback. Reported-by: Jan Kara <jack@suse.cz> Fixes: `63899c6f88` ("iomap: add a page_done callback") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2019-05-01 07:47:37 -07:00
Andreas Gruenbacher	26ddb1f4fd	fs: Turn __generic_write_end into a void function The VFS-internal __generic_write_end helper always returns the value of its @copied argument. This can be confusing, and it isn't very useful anyway, so turn __generic_write_end into a function returning void instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2019-05-01 07:47:37 -07:00
Carlos Maiolino	dce30ca9e3	fs: fix guard_bio_eod to check for real EOD errors guard_bio_eod() can truncate a segment in bio to allow it to do IO on odd last sectors of a device. It already checks if the IO starts past EOD, but it does not consider the possibility of an IO request starting within device boundaries can contain more than one segment past EOD. In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will underflow bvec->bv_len. Fix this by checking if truncated_bytes is lower than PAGE_SIZE. This situation has been found on filesystems such as isofs and vfat, which doesn't check the device size before mount, if the device is smaller than the filesystem itself, a readahead on such filesystem, which spans EOD, can trigger this situation, leading a call to zero_user() with a wrong size possibly corrupting memory. I didn't see any crash, or didn't let the system run long enough to check if memory corruption will be hit somewhere, but adding instrumentation to guard_bio_end() to check truncated_bytes size, was enough to see the error. The following script can trigger the error. MNT=/mnt IMG=./DISK.img DEV=/dev/loop0 mkfs.vfat $IMG mount $IMG $MNT cp -R /etc $MNT &> /dev/null umount $MNT losetup -D losetup --find --show --sizelimit 16247280 $IMG mount $DEV $MNT find $MNT -type f -exec cat {} + >/dev/null Kudos to Eric Sandeen for coming up with the reproducer above Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-02-28 13:59:41 -07:00
Jens Axboe	6fb845f0e7	Merge tag 'v5.0-rc6' into for-5.1/block Pull in 5.0-rc6 to avoid a dumb merge conflict with fs/iomap.c. This is needed since io_uring is now based on the block branch, to avoid a conflict between the multi-page bvecs and the bits of io_uring that touch the core block parts. * tag 'v5.0-rc6': (525 commits) Linux 5.0-rc6 x86/mm: Make set_pmd_at() paravirt aware MAINTAINERS: Update the ocores i2c bus driver maintainer, etc blk-mq: remove duplicated definition of blk_mq_freeze_queue Blk-iolatency: warn on negative inflight IO counter blk-iolatency: fix IO hang due to negative inflight counter MAINTAINERS: unify reference to xen-devel list x86/mm/cpa: Fix set_mce_nospec() futex: Handle early deadlock return correctly futex: Fix barrier comment net: dsa: b53: Fix for failure when irq is not defined in dt blktrace: Show requests without sector mips: cm: reprime error cause mips: loongson64: remove unreachable(), fix loongson_poweroff(). sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach() geneve: should not call rt6_lookup() when ipv6 was disabled KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221) KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222) kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974) signal: Better detection of synchronous signals ...	2019-02-15 08:43:59 -07:00
Ming Lei	f70f446407	fs/buffer.c: use bvec iterator to truncate the bio Once multi-page bvec is enabled, the last bvec may include more than one page, this patch use mp_bvec_last_segment() to truncate the bio. Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-02-15 08:40:11 -07:00
Tetsuo Handa	43636c804d	fs: ratelimit __find_get_block_slow() failure message. When something let __find_get_block_slow() hit all_mapped path, it calls printk() for 100+ times per a second. But there is no need to print same message with such high frequency; it is just asking for stall warning, or at least bloating log files. [ 399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 [ 399.873324][T15342] b_state=0x00000029, b_size=512 [ 399.878403][T15342] device loop0 blocksize: 4096 [ 399.883296][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 [ 399.890400][T15342] b_state=0x00000029, b_size=512 [ 399.895595][T15342] device loop0 blocksize: 4096 [ 399.900556][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8 [ 399.907471][T15342] b_state=0x00000029, b_size=512 [ 399.912506][T15342] device loop0 blocksize: 4096 This patch reduces frequency to up to once per a second, in addition to concatenating three lines into one. [ 399.866302][T15342] __find_get_block_slow() failed. block=1, b_blocknr=8, b_state=0x00000029, b_size=512, device loop0 blocksize: 4096 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-02-06 12:58:56 -07:00
Davidlohr Bueso	08d405c8b8	fs/: remove caller signal_pending branch predictions This is already done for us internally by the signal machinery. [akpm@linux-foundation.org: fix fs/buffer.c] Link: http://lkml.kernel.org/r/20181116002713.8474-7-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-01-04 13:13:48 -08:00
Dennis Zhou	fd42df305f	blkcg: associate writeback bios with a blkg One of the goals of this series is to remove a separate reference to the css of the bio. This can and should be accessed via bio_blkcg(). In this patch, wbc_init_bio() now requires a bio to have a device associated with it. Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-12-07 22:26:37 -07:00
Linus Torvalds	5f21585384	Merge tag 'for-linus-20181102' of git://git.kernel.dk/linux-block Pull block layer fixes from Jens Axboe: "The biggest part of this pull request is the revert of the blkcg cleanup series. It had one fix earlier for a stacked device issue, but another one was reported. Rather than play whack-a-mole with this, revert the entire series and try again for the next kernel release. Apart from that, only small fixes/changes. Summary: - Indentation fixup for mtip32xx (Colin Ian King) - The blkcg cleanup series revert (Dennis Zhou) - Two NVMe fixes. One fixing a regression in the nvme request initialization in this merge window, causing nvme-fc to not work. The other is a suspend/resume p2p resource issue (James, Keith) - Fix sg discard merge, allowing us to merge in cases where we didn't before (Jianchao Wang) - Call rq_qos_exit() after the queue is frozen, preventing a hang (Ming) - Fix brd queue setup, fixing an oops if we fail setting up all devices (Ming)" * tag 'for-linus-20181102' of git://git.kernel.dk/linux-block: nvme-pci: fix conflicting p2p resource adds nvme-fc: fix request private initialization blkcg: revert blkcg cleanups series block: brd: associate with queue until adding disk block: call rq_qos_exit() after queue is frozen mtip32xx: clean an indentation issue, remove extraneous tabs block: fix the DISCARD request merge	2018-11-02 11:25:48 -07:00
Dennis Zhou	b5f2954d30	blkcg: revert blkcg cleanups series This reverts a series committed earlier due to null pointer exception bug report in [1]. It seems there are edge case interactions that I did not consider and will need some time to understand what causes the adverse interactions. The original series can be found in [2] with a follow up series in [3]. [1] https://www.spinics.net/lists/cgroups/msg20719.html [2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/ [3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/ This reverts the following commits: `d459d853c2`, `b2c3fa5467`, `101246ec02`, `b3b9f24f5f`, `e2b0989954`, `f0fcb3ec89`, `c839e7a03f`, `bdc2491708`, `74b7c02a9b`, `5bf9a1f3b4`, `a7b39b4e96`, `07b05bcc32`, `49f4c2dc2b`, `27e6fa996c` Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-11-01 19:59:53 -06:00

1 2 3 4 5 ...

420 Commits