xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Fengnan Chang	ed493d61fe	FROMGIT: f2fs: fix to use WHINT_MODE Since active_logs can be set to 2 or 4 or NR_CURSEG_PERSIST_TYPE(6), it cannot be set to NR_CURSEG_TYPE(8). That is, whint_mode is always off. Therefore, the condition is changed from NR_CURSEG_TYPE to NR_CURSEG_PERSIST_TYPE. Bug: 202812742 Cc: Chao Yu <chao@kernel.org> Fixes: `d0b9e42ab6` (f2fs: introduce inmem curseg) Reported-by: tanghuan <tanghuan@vivo.com> Signed-off-by: Keoseong Park <keosung.park@samsung.com> Signed-off-by: Fengnan Chang <changfengnan@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit 011e0868e0cf1237675b22e36fffa958fb08f46e https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/ dev) Change-Id: Ic2a2b055e1f37f6263d8243527e83fe6bd4f7070 Signed-off-by: Fengnan Chang <changfengnan@vivo.com>	2021-10-14 04:22:27 +00:00
Gao Xiang	dcd77f0b74	UPSTREAM: erofs: fix 1 lcluster-sized pcluster for big pcluster If the 1st NONHEAD lcluster of a pcluster isn't CBLKCNT lcluster type rather than a HEAD or PLAIN type instead, which means its pclustersize _must_ be 1 lcluster (since its uncompressed size < 2 lclusters), as illustrated below: HEAD HEAD / PLAIN lcluster type ____________ ____________ |_:__________|_________:__| file data (uncompressed) . . .____________. |____________| pcluster data (compressed) Such on-disk case was explained before [1] but missed to be handled properly in the runtime implementation. It can be observed if manually generating 1 lcluster-sized pcluster with 2 lclusters (thus CBLKCNT doesn't exist.) Let's fix it now. [1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org Fixes: cec6e93beadf ("erofs: support parsing big pcluster compress indexes") Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <xiang@kernel.org> Bug: 201372112 Change-Id: I7e46baa993790f8908287ac36e19d32e536116db (cherry picked from commit 0852b6ca941ef3ff75076e85738877bd3271e1cd) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:33:46 +00:00
Gao Xiang	e085d3f0d0	UPSTREAM: erofs: enable big pcluster feature Enable COMPR_CFGS and BIG_PCLUSTER since the implementations are all settled properly. Link: https://lore.kernel.org/r/20210407043927.10623-11-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I85a9045c146b6877eb371904e82d0481e21bbb75 (cherry picked from commit 8e6c8fa9f2e95c88a642521a5da19a8e31748846) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:33:20 +00:00
Gao Xiang	ed0607cc52	UPSTREAM: erofs: support decompress big pcluster for lz4 backend Prior to big pcluster, there was only one compressed page so it'd easy to map this. However, when big pcluster is enabled, more work needs to be done to handle multiple compressed pages. In detail, - (maptype 0) if there is only one compressed page + no need to copy inplace I/O, just map it directly what we did before; - (maptype 1) if there are more compressed pages + no need to copy inplace I/O, vmap such compressed pages instead; - (maptype 2) if inplace I/O needs to be copied, use per-CPU buffers for decompression then. Another thing is how to detect inplace decompression is feasable or not (it's still quite easy for non big pclusters), apart from the inplace margin calculation, inplace I/O page reusing order is also needed to be considered for each compressed page. Currently, if the compressed page is the xth page, it shouldn't be reused as [0 ... nrpages_out - nrpages_in + x], otherwise a full copy will be triggered. Although there are some extra optimization ideas for this, I'd like to make big pcluster work correctly first and obviously it can be further optimized later since it has nothing with the on-disk format at all. Link: https://lore.kernel.org/r/20210407043927.10623-10-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I8e8b90bd0a401850ea81d895dc55e5d8a2772ed7 (cherry picked from commit 598162d050801e556750defff4ddab499e5d76ed) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:32:20 +00:00
Gao Xiang	d34cb6cdc0	UPSTREAM: erofs: support parsing big pcluster compact indexes Different from non-compact indexes, several lclusters are packed as the compact form at once and an unique base blkaddr is stored for each pack, so each lcluster index would take less space on avarage (e.g. 2 bytes for COMPACT_2B.) btw, that is also why BIG_PCLUSTER switch should be consistent for compact head0/1. Prior to big pcluster, the size of all pclusters was 1 lcluster. Therefore, when a new HEAD lcluster was scanned, blkaddr would be bumped by 1 lcluster. However, that way doesn't work anymore for big pcluster since we actually don't know the compressed size of pclusters in advance (before reading CBLKCNT lcluster). So, instead, let blkaddr of each pack be the first pcluster blkaddr with a valid CBLKCNT, in detail, 1) if CBLKCNT starts at the pack, this first valid pcluster is itself, e.g. _____________________________________________________________ |_CBLKCNT0_|_NONHEAD_| .. |_HEAD_|_CBLKCNT1_| ... |_HEAD_| ... ^ = blkaddr base ^ += CBLKCNT0 ^ += CBLKCNT1 2) if CBLKCNT doesn't start at the pack, the first valid pcluster is the next pcluster, e.g. _________________________________________________________ | NONHEAD_| .. |_HEAD_|_CBLKCNT0_| ... |_HEAD_|_HEAD_| ... ^ = blkaddr base ^ += CBLKCNT0 ^ += 1 When a CBLKCNT is found, blkaddr will be increased by CBLKCNT lclusters, or a new HEAD is found immediately, bump blkaddr by 1 instead (see the picture above.) Also noted if CBLKCNT is the end of the pack, instead of storing delta1 (distance of the next HEAD lcluster) as normal NONHEADs, it still uses the compressed block count (delta0) since delta1 can be calculated indirectly but the block count can't. Adjust decoding logic to fit big pcluster compact indexes as well. Link: https://lore.kernel.org/r/20210407043927.10623-9-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I09d6bb4c9d390dc6169cb9dd4efbc7fd6b3be5d0 (cherry picked from commit b86269f43892316ef5a177d7180d09d101a46f22) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:31:56 +00:00
Gao Xiang	051d76b899	UPSTREAM: erofs: support parsing big pcluster compress indexes When INCOMPAT_BIG_PCLUSTER sb feature is enabled, legacy compress indexes will also have the same on-disk header compact indexes to keep per-file configurations instead of leaving it zeroed. If ADVISE_BIG_PCLUSTER is set for a file, CBLKCNT will be loaded for each pcluster in this file by parsing 1st non-head lcluster. Link: https://lore.kernel.org/r/20210407043927.10623-8-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I97b8d1cc54e057789d22f1cdfafc21ed6a69b149 (cherry picked from commit cec6e93beadfd145758af2c0854fcc2abb8170cb) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:31:46 +00:00
Gao Xiang	d149931601	UPSTREAM: erofs: adjust per-CPU buffers according to max_pclusterblks Adjust per-CPU buffers on demand since big pcluster definition is available. Also, bail out unsupported pcluster size according to Z_EROFS_PCLUSTER_MAX_SIZE. Link: https://lore.kernel.org/r/20210407043927.10623-7-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I86665be0519f328614d93aa24cb06043655576b9 (cherry picked from commit 4fea63f7d76e425965033938bab6488e48579e3f) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:31:21 +00:00
Gao Xiang	95a1d5df84	UPSTREAM: erofs: add big physical cluster definition Big pcluster indicates the size of compressed data for each physical pcluster is no longer fixed as block size, but could be more than 1 block (more accurately, 1 logical pcluster) When big pcluster feature is enabled for head0/1, delta0 of the 1st non-head lcluster index will keep block count of this pcluster in lcluster size instead of 1. Or, the compressed size of pcluster should be 1 lcluster if pcluster has no non-head lcluster index. Also note that BIG_PCLUSTER feature reuses COMPR_CFGS feature since it depends on COMPR_CFGS and will be released together. Link: https://lore.kernel.org/r/20210407043927.10623-6-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: Ibd33f2764f4420f370f9293ed325efdba9ea70a7 (cherry picked from commit 5404c33010cb8ee063c05376d4a2eba129872281) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:30:24 +00:00
Gao Xiang	8043aaed1d	UPSTREAM: erofs: fix up inplace I/O pointer for big pcluster When picking up inplace I/O pages, it should be traversed in reverse order in aligned with the traversal order of file-backed online pages. Also, index should be updated together when preloading compressed pages. Previously, only page-sized pclustersize was supported so no problem at all. Also rename `compressedpages' to `icpage_ptr' to reflect its functionality. Link: https://lore.kernel.org/r/20210407043927.10623-5-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I752769d004c18f74636bcfdea767572daa6f7072 (cherry picked from commit 81382f5f5cb0c9c5694c19d36460f757a8c96841) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:29:52 +00:00
Gao Xiang	6ad2f8f169	UPSTREAM: erofs: introduce physical cluster slab pools Since multiple pcluster sizes could be used at once, the number of compressed pages will become a variable factor. It's necessary to introduce slab pools rather than a single slab cache now. This limits the pclustersize to 1M (Z_EROFS_PCLUSTER_MAX_SIZE), and get rid of the obsolete EROFS_FS_CLUSTER_PAGE_LIMIT, which has no use now. Link: https://lore.kernel.org/r/20210407043927.10623-4-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I821f3cf35820dbb320eeedc3ae934fd1d455dfd7 (cherry picked from commit 9f6cc76e6ff0631a99cd94eab8af137057633a52) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:29:22 +00:00
Gao Xiang	432f58b100	UPSTREAM: erofs: introduce multipage per-CPU buffers To deal the with the cases which inplace decompression is infeasible for some inplace I/O. Per-CPU buffers was introduced to get rid of page allocation latency and thrash for low-latency decompression algorithms such as lz4. For the big pcluster feature, introduce multipage per-CPU buffers to keep such inplace I/O pclusters temporarily as well but note that per-CPU pages are just consecutive virtually. When a new big pcluster fs is mounted, its max pclustersize will be read and per-CPU buffers can be growed if needed. Shrinking adjustable per-CPU buffers is more complex (because we don't know if such size is still be used), so currently just release them all when unloading. Link: https://lore.kernel.org/r/20210409190630.19569-1-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I5b3ab69ef58aaea635911b0c08cceeb4b38122da (cherry picked from commit 524887347fcb67faa0a63dd3c4c02ab48d4968d4) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:28:24 +00:00
Vladimir Zapolskiy	571c9a0bd3	UPSTREAM: erofs: remove a void EROFS_VERSION macro set in Makefile Since commit `4f761fa253` ("erofs: rename errln/infoln/debugln to erofs_{err, info, dbg}") the defined macro EROFS_VERSION has no affect, therefore removing it from the Makefile is a non-functional change. Link: https://lore.kernel.org/r/20201030122839.25431-1-vladimir@tuxera.com Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I06df2883e85d4bfdc08bccb7aa7c014570cd156b (cherry picked from commit a426ce9d6751cc8e709f031fa546900e4239f125) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:27:55 +00:00
Gao Xiang	431d73396d	UPSTREAM: erofs: reserve physical_clusterbits[] Formal big pcluster design is actually more powerful / flexable than the previous thought whose pclustersize was fixed as power-of-2 blocks, which was obviously inefficient and space-wasting. Instead, pclustersize can now be set independently for each pcluster, so various pcluster sizes can also be used together in one file if mkfs wants (for example, according to data type and/or compression ratio). Let's get rid of previous physical_clusterbits[] setting (also notice that corresponding on-disk fields are still 0 for now). Therefore, head1/2 can be used for at most 2 different algorithms in one file and again pclustersize is now independent of these. Link: https://lore.kernel.org/r/20210407043927.10623-2-xiang@kernel.org Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I7db236f06b202949b882a366b347f0c4e9dc0c3e (cherry picked from commit 54e0b6c873dcbd02b9b479c893f6fba8fcbc6a9c) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:27:22 +00:00
Ruiqi Gong	89dbc6246a	UPSTREAM: erofs: Clean up spelling mistakes found in fs/erofs zmap.c: s/correspoinding/corresponding zdata.c: s/endding/ending Link: https://lore.kernel.org/r/20210331093920.31923-1-gongruiqi1@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com> Reviewed-by: Gao Xiang <hsiangkao@redhat.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: Ie848ca314462a8093fe94405b260c12b23b663b9 (cherry picked from commit fe6adcce7e297fcb49f40c62df42334690c3f848) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:26:25 +00:00
Gao Xiang	ac1f14e9d5	UPSTREAM: erofs: add on-disk compression configurations Add a bitmap for available compression algorithms and a variable-sized on-disk table for compression options in preparation for upcoming big pcluster and LZMA algorithm, which follows the end of super block. To parse the compression options, the bitmap is scanned one by one. For each available algorithm, there is data followed by 2-byte `length' correspondingly (it's enough for most cases, or entire fs blocks should be used.) With such available algorithm bitmap, kernel itself can also refuse to mount such filesystem if any unsupported compression algorithm exists. Note that COMPR_CFGS feature will be enabled with BIG_PCLUSTER. Link: https://lore.kernel.org/r/20210329100012.12980-1-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I9f9c69d5cd05da8c5f17e3650511e83c5f89200e (cherry picked from commit 14373711dd54be8a84e2f4f624bc58787f80cfbd) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:26:06 +00:00
Gao Xiang	cd21e62366	UPSTREAM: erofs: introduce on-disk lz4 fs configurations Introduce z_erofs_lz4_cfgs to store all lz4 configurations. Currently it's only max_distance, but will be used for new features later. Link: https://lore.kernel.org/r/20210329012308.28743-4-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: Ib9873b530bdc82f8cc5e79ac85ede737d883ff12 (cherry picked from commit 46249cded18ac0c4ffb7b177219510a133a51c00) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:25:23 +00:00
Gao Xiang	e17fd2ac9d	UPSTREAM: erofs: introduce erofs_sb_has_xxx() helpers Introduce erofs_sb_has_xxx() to make long checks short, especially for later big pcluster & LZMA features. Link: https://lore.kernel.org/r/20210329012308.28743-2-hsiangkao@aol.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I145fa9c670284b609b59a246f918fb09dc562356 (cherry picked from commit de06a6a375414be03ce5b1054f2d836591923a1d) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:24:39 +00:00
Yue Hu	ba1a3d1fb2	UPSTREAM: erofs: don't use erofs_map_blocks() any more Currently, erofs_map_blocks() will be called only from erofs_{bmap, read_raw_page} which are all for uncompressed files. So, the compression branch in erofs_map_blocks() is pointless. Let's remove it and use erofs_map_blocks_flatmode() directly. Also update related comments. Link: https://lore.kernel.org/r/20210325071008.573-1-zbestahu@gmail.com Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: Ic3362c62d7c917f05f1cd11120ee6280a22221b8 (cherry picked from commit 8137824eddd2e790c61c70c20d70a087faca95fa) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:23:24 +00:00
Gao Xiang	384b2cdaf8	UPSTREAM: erofs: complete a missing case for inplace I/O Add a missing case which could cause unnecessary page allocation but not directly use inplace I/O instead, which increases runtime extra memory footprint. The detail is, considering an online file-backed page, the right half of the page is chosen to be cached (e.g. the end page of a readahead request) and some of its data doesn't exist in managed cache, so the pcluster will be definitely kept in the submission chain. (IOWs, it cannot be decompressed without I/O, e.g., due to the bypass queue). Currently, DELAYEDALLOC/TRYALLOC cases can be downgraded as NOINPLACE, and stop online pages from inplace I/O. After this patch, unneeded page allocations won't be observed in pickup_page_for_submission() then. Link: https://lore.kernel.org/r/20210321183227.5182-1-hsiangkao@aol.com Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Bug: 201372112 Change-Id: I639279e87b749b8625d2a8e77055a9fc52073542 (cherry picked from commit 0b964600d3aae56ff9d5bdd710a79f39a44c572c) Signed-off-by: Huang Jianan <huangjianan@oppo.com>	2021-09-28 08:21:51 +00:00
Minchan Kim	a9ac6ae90e	BACKPORT: UPSTREAM: mm: fs: invalidate bh_lrus for only cold path kernel test robot reported the regression of fio.write_iops[1] with [2]. Since lru_add_drain is called frequently, invalidate bh_lrus there could increase bh_lrus cache miss ratio, which needs more IO in the end. This patch moves the bh_lrus invalidation from the hot path( e.g., zap_page_range, pagevec_release) to cold path(i.e., lru_add_drain_all, lru_cache_disable). "Xing, Zhengjun" confirmed : I test the patch, the regression reduced to -2.9%. [1] https://lore.kernel.org/lkml/20210520083144.GD14190@xsang-OptiPlex-9020/ [2] 8cc621d2f45d, mm: fs: invalidate BH LRU during page migration Bug: 194673488 Link: https://lkml.kernel.org/r/20210907212347.1977686-1-minchan@kernel.org (cherry picked from commit 243418e3925d5b5b0657ae54c322d43035e97eed) [Chris: resolved conflicts due to Minchan's AOSP LRU commits] Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Chris Goldsworthy <cgoldswo@codeaurora.org> Tested-by: "Xing, Zhengjun" <zhengjun.xing@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com> Change-Id: Icc5e456b058df516480b4378853464d6d7b43505	2021-09-27 17:48:37 -07:00
Alessio Balsini	cfc0a49c73	ANDROID: fs/fuse: Keep FUSE file times consistent with lower file When FUSE passthrough is used, the lower file system file is manipulated directly, but neither mtime, atime or ctime of the referencing FUSE file is updated. Fix by updating the file times when passthrough operations are performed. Bug: 200779468 Reported-by: Fengnan Chang <changfengnan@vivo.com> Reported-by: Ed Tsai <ed.tsai@mediatek.com> Signed-off-by: Alessio Balsini <balsini@google.com> Change-Id: I35b72196b2cc1d79a9f62ddb32e2cfa934c3b6d3	2021-09-24 13:30:08 +00:00
Biao Li	4652709913	ANDROID: fuse: Allocate zeroed memory for canonical path The page used to contain the fuse_dentry_canonical_path to be handled in fuse_dev_do_write is allocated using __get_free_pages(GFP_KERNEL). The returned page may contain undefined data, that by chance may be considered as a valid path name that is not in the cache. In that case, if the FUSE daemon mistakenly doesn't fill the canonical path buffer, the FUSE driver may fall into two blocking request_wait_answer(fuse_dev_write->kern_path->fuse_lookup_name) causing a deadlock condition. The stack is as follows: find S 0 20511 20117 0x00000000 Call trace: [<ffffff8008085e78>] __switch_to+0xb8/0xd4 [<ffffff8008a0cac4>] __schedule+0x458/0x714 [<ffffff8008a0ce0c>] schedule+0x8c/0xa8 [<ffffff800833865c>] request_wait_answer+0x74/0x220 [<ffffff8008339f70>] __fuse_request_send+0x8c/0xa0 [<ffffff8008339fe4>] fuse_request_send+0x60/0x6c [<ffffff800833c1a8>] fuse_dentry_canonical_path+0xb8/0x104 [<ffffff800820b14c>] do_sys_open+0x1b4/0x260 [<ffffff800820b27c>] SyS_openat+0x3c/0x4c [<ffffff8008083540>] el0_svc_naked+0x34/0x38 mount.ntfs-3g S 0 5845 1 0x00000000 Call trace: [<ffffff8008085e78>] __switch_to+0xb8/0xd4 [<ffffff8008a0cac4>] __schedule+0x458/0x714 [<ffffff8008a0ce0c>] schedule+0x8c/0xa8 [<ffffff800833865c>] request_wait_answer+0x74/0x220 [<ffffff8008339f70>] __fuse_request_send+0x8c/0xa0 [<ffffff8008339fe4>] fuse_request_send+0x60/0x6c [<ffffff800833bdb0>] fuse_simple_request+0x128/0x16c [<ffffff800833dddc>] fuse_lookup_name+0x104/0x1b0 [<ffffff800833dee4>] fuse_lookup+0x5c/0x11c [<ffffff800821861c>] lookup_slow+0xfc/0x174 [<ffffff800821b474>] walk_component+0xf0/0x290 [<ffffff800821bbac>] path_lookupat+0xa0/0x128 [<ffffff800821c7f4>] filename_lookup+0x84/0x124 [<ffffff800821c8d8>] kern_path+0x44/0x54 [<ffffff800833b0c8>] fuse_dev_do_write+0x828/0xa0c [<ffffff800833b610>] fuse_dev_write+0x90/0xb4 [<ffffff800820b770>] do_iter_readv_writev+0xf4/0x13c [<ffffff800820cc88>] do_readv_writev+0xec/0x220 [<ffffff800820d05c>] vfs_writev+0x60/0x74 [<ffffff800820d0ec>] do_writev+0x7c/0x100 [<ffffff800820e348>] SyS_writev+0x38/0x48 [<ffffff8008083540>] el0_svc_naked+0x34/0x38 Fix by ensuring that the page allocated for the canonical path is zeroed. Bug: 194856119 Bug: 196051870 Fixes: 24ab59f6bb42 ("ANDROID: fuse: Add support for d_canonical_path") Signed-off-by: Biao Li <libiao@allwinnertech.com> Signed-off-by: Shuosheng Huang <huangshuosheng@allwinnertech.com> Signed-off-by: Alessio Balsini <balsini@google.com> Change-Id: I400815dc1049d90c308f5cf87ce60de97ff82131	2021-09-17 09:18:17 +01:00
Jaegeuk Kim	96db9b84a6	FROMGIT: f2fs: should use GFP_NOFS for directory inodes We use inline_dentry which requires to allocate dentry page when adding a link. If we allow to reclaim memory from filesystem, we do down_read(&sbi->cp_rwsem) twice by f2fs_lock_op(). I think this should be okay, but how about stopping the lockdep complaint [1]? f2fs_create() - f2fs_lock_op() - f2fs_do_add_link() - __f2fs_find_entry - f2fs_get_read_data_page() -> kswapd - shrink_node - f2fs_evict_inode - f2fs_lock_op() [1] fs_reclaim ){+.+.}-{0:0} : kswapd0: lock_acquire+0x114/0x394 kswapd0: __fs_reclaim_acquire+0x40/0x50 kswapd0: prepare_alloc_pages+0x94/0x1ec kswapd0: __alloc_pages_nodemask+0x78/0x1b0 kswapd0: pagecache_get_page+0x2e0/0x57c kswapd0: f2fs_get_read_data_page+0xc0/0x394 kswapd0: f2fs_find_data_page+0xa4/0x23c kswapd0: find_in_level+0x1a8/0x36c kswapd0: __f2fs_find_entry+0x70/0x100 kswapd0: f2fs_do_add_link+0x84/0x1ec kswapd0: f2fs_mkdir+0xe4/0x1e4 kswapd0: vfs_mkdir+0x110/0x1c0 kswapd0: do_mkdirat+0xa4/0x160 kswapd0: __arm64_sys_mkdirat+0x24/0x34 kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8 kswapd0: do_el0_svc+0x28/0xa0 kswapd0: el0_svc+0x24/0x38 kswapd0: el0_sync_handler+0x88/0xec kswapd0: el0_sync+0x1c0/0x200 kswapd0: -> #1 ( &sbi->cp_rwsem ){++++}-{3:3} : kswapd0: lock_acquire+0x114/0x394 kswapd0: down_read+0x7c/0x98 kswapd0: f2fs_do_truncate_blocks+0x78/0x3dc kswapd0: f2fs_truncate+0xc8/0x128 kswapd0: f2fs_evict_inode+0x2b8/0x8b8 kswapd0: evict+0xd4/0x2f8 kswapd0: iput+0x1c0/0x258 kswapd0: do_unlinkat+0x170/0x2a0 kswapd0: __arm64_sys_unlinkat+0x4c/0x68 kswapd0: el0_svc_common.llvm.17258447499513131576+0xc4/0x1e8 kswapd0: do_el0_svc+0x28/0xa0 kswapd0: el0_svc+0x24/0x38 kswapd0: el0_sync_handler+0x88/0xec kswapd0: el0_sync+0x1c0/0x200 Bug: 198991863 Cc: stable@vger.kernel.org Fixes: `bdbc90fa55` ("f2fs: don't put dentry page in pagecache into highmem") Reviewed-by: Chao Yu <chao@kernel.org> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Light Hsieh <light.hsieh@mediatek.com> Tested-by: Light Hsieh <light.hsieh@mediatek.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit 92d602bc7177325e7453189a22e0c8764ed3453e https://https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev) Change-Id: Ie25d567390ea5185955356b330e04ebfa5c8836f	2021-09-16 08:46:29 -07:00
Jaegeuk Kim	90c60a51f5	UPSTREAM: f2fs: guarantee to write dirty data when enabling checkpoint back We must flush all the dirty data when enabling checkpoint back. Let's guarantee that first by adding a retry logic on sync_inodes_sb(). In addition to that, this patch adds to flush data in fsync when checkpoint is disabled, which can mitigate the sync_inodes_sb() failures in advance. Bug: 194449609 Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit dddd3d65293a52c2c3850c19b1e5115712e534d8) Change-Id: I5bbef7386ddbb44fd925262fb68a8ef0a4960993	2021-09-09 10:41:52 +00:00
Linus Torvalds	3de34cc5ea	UPSTREAM: pipe: make pipe writes always wake up readers Since commit `1b6b26ae70` ("pipe: fix and clarify pipe write wakeup logic") we have sanitized the pipe write logic, and would only try to wake up readers if they needed it. In particular, if the pipe already had data in it before the write, there was no point in trying to wake up a reader, since any existing readers must have been aware of the pre-existing data already. Doing extraneous wakeups will only cause potential thundering herd problems. However, it turns out that some Android libraries have misused the EPOLL interface, and expected "edge triggered" be to "any new write will trigger it". Even if there was no edge in sight. Quoting Sandeep Patil: "The commit `1b6b26ae70` ('pipe: fix and clarify pipe write wakeup logic') changed pipe write logic to wakeup readers only if the pipe was empty at the time of write. However, there are libraries that relied upon the older behavior for notification scheme similar to what's described in [1] One such library 'realm-core'[2] is used by numerous Android applications. The library uses a similar notification mechanism as GNU Make but it never drains the pipe until it is full. When Android moved to v5.10 kernel, all applications using this library stopped working. The library has since been fixed[3] but it will be a while before all applications incorporate the updated library" Our regression rule for the kernel is that if applications break from new behavior, it's a regression, even if it was because the application did something patently wrong. Also note the original report [4] by Michal Kerrisk about a test for this epoll behavior - but at that point we didn't know of any actual broken use case. So add the extraneous wakeup, to approximate the old behavior. [ I say "approximate", because the exact old behavior was to do a wakeup not for each write(), but for each pipe buffer chunk that was filled in. The behavior introduced by this change is not that - this is just "every write will cause a wakeup, whether necessary or not", which seems to be sufficient for the broken library use. ] It's worth noting that this adds the extraneous wakeup only for the write side, while the read side still considers the "edge" to be purely about reading enough from the pipe to allow further writes. See commit `f467a6a664` ("pipe: fix and clarify pipe read wakeup logic") for the pipe read case, which remains that "only wake up if the pipe was full, and we read something from it". Link: https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0bam6w@mail.gmail.com/ [1] Link: https://github.com/realm/realm-core [2] Link: https://github.com/realm/realm-core/issues/4666 [3] Link: https://lore.kernel.org/lkml/CAKgNAkjMBGeAwF=2MKK758BhxvW58wYTgYKB2V-gY1PwXxrH+Q@mail.gmail.com/ [4] Link: https://lore.kernel.org/lkml/20210729222635.2937453-1-sspatil@android.com/ Bug: 193851993 Reported-by: Sandeep Patil <sspatil@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 3a34b13a88caeb2800ab44a4918f230041b37dd9) Signed-off-by: Sandeep Patil <sspatil@android.com> Change-Id: Idcf3e8faa31bff47ada4b815237a355e0757b964	2021-08-03 21:20:32 +00:00
Sandeep Patil	6b7e007164	ANDROID: Revert "ANDROID: fs: pipe: wakeup readers on small writes even if pipe had data" This reverts commit `76879a1964` ("ANDROID: fs: pipe: wakeup readers on small writes even if pipe had data") to replace that with the bug fix that landed upstream at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a34b13a88caeb2800ab44a4918f230041b37dd9 Bug: 193851993 Test: Build and boot cuttlefish. Signed-off-by: Sandeep Patil <sspatil@android.com> Change-Id: Ic4f5e2cc516b4ea68ae7d63225d1529217990431	2021-08-03 21:20:13 +00:00
Kalesh Singh	36fbb55631	FROMGIT: procfs: prevent unpriveleged processes accessing fdinfo dir The file permissions on the fdinfo dir from were changed from S_IRUSR|S_IXUSR to S_IRUGO|S_IXUGO, and a PTRACE_MODE_READ check was added for opening the fdinfo files [1]. However, the ptrace permission check was not added to the directory, allowing anyone to get the open FD numbers by reading the fdinfo directory. Add the missing ptrace permission check for opening the fdinfo directory. [1] https://lkml.kernel.org/r/20210308170651.919148-1-kaleshsingh@google.com Link: https://lkml.kernel.org/r/20210713162008.1056986-1-kaleshsingh@google.com Fixes: 7bc3fa0172a4 ("procfs: allow reading fdinfo with PTRACE_MODE_READ") Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Christian König <christian.koenig@amd.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Hridya Valsaraju <hridya@google.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mark Brown <broonie@kernel.org> Bug: 151772539 Change-Id: I274b30aa0a5ce8412eae7161d31c6ee955035da9 (cherry picked from commit fc73829fa54b0c7af32d6da7c972eb3390957da4 git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master) Signed-off-by: Kalesh Singh <kaleshsingh@google.com>	2021-07-29 15:10:21 -04:00
Jaegeuk Kim	fc2d64ec5d	FROMGIT: f2fs: don't sleep while grabing nat_tree_lock This tries to fix priority inversion in the below condition resulting in long checkpoint delay. f2fs_get_node_info() - nat_tree_lock -> sleep to grab journal_rwsem by contention checkpoint - waiting for nat_tree_lock In order to let checkpoint go, let's release nat_tree_lock, if there's a journal_rwsem contention. Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Bug: 191987855 (cherry picked from commit 2eeb0dce728a7eac3e4dfe355d98af40d61f7a26 git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev) Change-Id: I97ac4f9d3bde399ab4f17f5b3a6e949ae9b79f0f Signed-off-by: Daeho Jeong <daehojeong@google.com>	2021-07-29 01:29:53 +00:00
Sandeep Patil	76879a1964	ANDROID: fs: pipe: wakeup readers on small writes even if pipe had data commit '1b6b26ae7053 ("pipe: fix and clarify pipe write wakeup logic")' change `pipe_write()` wakeup logic to wakeup readers only if the pipe was empty. This meant that applications that are not draining the pipe before each write were exposed to unexpected timeouts / hangs in epoll_wait() waiting for data in a pipe using EPOLLIN | EPOLLET flags. This behaviour can be easily tested with android12-5.4 kernel where the test that uses pipes for notifications in this way works while it fails 100% with android12-5.10. This change restores the old behavior to wakeup all pipe_readers if any new data is written to the pipe. Bug: 193851993 Bug: 193846582 Change-Id: If0c5a844091ccf16d5236bd072326325d4d5447a Signed-off-by: Sandeep Patil <sspatil@google.com>	2021-07-27 17:35:49 +00:00
Jaegeuk Kim	e33cf9dd43	FROMGIT: f2fs: let's keep writing IOs on SBI_NEED_FSCK SBI_NEED_FSCK is an indicator that fsck.f2fs needs to be triggered, so it is not fully critical to stop any IO writes. So, let's allow to write data instead of reporting EIO forever given SBI_NEED_FSCK, but do keep OPU. Bug: 193659742 Fixes: 955772787667 ("f2fs: drop inplace IO if fs status is abnormal") Cc: <stable@kernel.org> # v5.13+ Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> (cherry picked from commit 1ffc8f5f7751f91fe6af527d426a723231b741a6 git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev) Change-Id: I9585358c1cee864064d9d210ee643f49ff1e6749	2021-07-22 23:53:01 +00:00
Matthias Maennich	d0a88ae479	ANDROID: Enable GKI Dr. No Enforcement This effectively locks down OWNERS approval to a small group to guard the code base against unintentional breakages. Bug: 194314089 Signed-off-by: Matthias Maennich <maennich@google.com> Change-Id: Ifd1ea97639a622320ea83f901f6451e2e52b38d4	2021-07-21 20:51:47 +01:00
Daeho Jeong	11cec52238	FROMGIT: f2fs: add sysfs nodes to get GC info for each GC mode Added gc_reclaimed_segments and gc_segment_mode sysfs nodes. 1) "gc_reclaimed_segments" shows how many segments have been reclaimed by GC during a specific GC mode. 2) "gc_segment_mode" is used to control for which gc mode the "gc_reclaimed_segments" node shows. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Bug: 182708936 (cherry picked from commit 07c6b5933ebf58b6132aea9f3e72a62486882bfb git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev) Change-Id: Ie8c2ccf5d36ce2f388c98b77e22848f3ff6645c3 Signed-off-by: Daeho Jeong <daehojeong@google.com>	2021-07-16 00:21:51 +00:00
Prasad Sodagudi	9b136eab76	ANDROID: pstore/ram: Add backward compatibility for ramoops reserved region Some platforms might still expect a dedicated memory region for the ramoops node. So add logic to detect the start and size of the ramoops memory region by looking up the reserved memory region with of_reserved_mem_lookup() when platform_get_resource() fails. Bug: 191636717 Change-Id: Idc479b45fb3f637f7235efd6eabac62059d5e92b Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>	2021-07-15 18:27:15 +00:00
Kalesh Singh	1f0c32a667	UPSTREAM: procfs/dmabuf: add inode number to /proc/*/fdinfo Add an 'ino' field to /proc/<pid>/fdinfo/<FD> and /proc/<pid>/task/<tid>/fdinfo/<FD>. The inode numbers can be used to uniquely identify DMA buffers in user space and avoid a dependency on /proc/<pid>/fd/ when accounting per-process DMA buffer sizes. Link: https://lkml.kernel.org/r/20210308170651.919148-2-kaleshsingh@google.com Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Christian König <christian.koenig@amd.com> Cc: Jann Horn <jannh@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Hridya Valsaraju <hridya@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Cc: Szabolcs Nagy <szabolcs.nagy@arm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Michel Lespinasse <walken@google.com> Cc: Bernd Edlinger <bernd.edlinger@hotmail.de> Cc: Andrei Vagin <avagin@gmail.com> Cc: Helge Deller <deller@gmx.de> Cc: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 3845f256a8b527127bfbd4ced21e93d9e89aa6d7) Bug: 159126739 Bug: 167141117 Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Change-Id: Id07beac3edcc95c0b42805e24e5486965acbb46e	2021-07-12 22:38:22 +00:00
Kalesh Singh	0c8c125f57	UPSTREAM: procfs: allow reading fdinfo with PTRACE_MODE_READ Android captures per-process system memory state when certain low memory events (e.g. a foreground app kill) occur, to identify potential memory hoggers. In order to measure how much memory a process actually consumes, it is necessary to include the DMA buffer sizes for that process in the memory accounting. Since the handles to DMA buffers are raw FDs, it is important to be able to identify which processes have FD references to a DMA buffer. Currently, DMA buffer FDs can be accounted using /proc/<pid>/fd/* and /proc/<pid>/fdinfo -- both are only readable by the process owner, as follows: 1. Do a readlink on each FD. 2. If the target path begins with "/dmabuf", then the FD is a dmabuf FD. 3. stat the file to get the dmabuf inode number. 4. Read /proc/<pid>/fdinfo/<fd> to get the DMA buffer size. Accessing other processes' fdinfo requires root privileges. This limits the use of the interface to debugging environments and is not suitable for production builds. Granting root privileges even to a system process increases the attack surface and is highly undesirable. Since fdinfo doesn't permit reading process memory and manipulating process state, allow accessing fdinfo under PTRACE_MODE_READ_FSCRED. Link: https://lkml.kernel.org/r/20210308170651.919148-1-kaleshsingh@google.com Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Suggested-by: Jann Horn <jannh@google.com> Acked-by: Christian König <christian.koenig@amd.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Bernd Edlinger <bernd.edlinger@hotmail.de> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Helge Deller <deller@gmx.de> Cc: Hridya Valsaraju <hridya@google.com> Cc: James Morris <jamorris@linux.microsoft.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michel Lespinasse <walken@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Szabolcs Nagy <szabolcs.nagy@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 7bc3fa0172a423afb34e6df7a3998e5f23b1a94a) Bug: 159126739 Bug: 167141117 Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Change-Id: I842b689670f731138592f45c7124ef446d9aa59a	2021-07-12 22:38:16 +00:00
Kalesh Singh	2e0476a465	Revert "FROMLIST: procfs: Allow reading fdinfo with PTRACE_MODE_READ" Revert submission 1578844 Reason for revert: Will be replaced by upstream version Reverted Changes: Ic9c551998:FROMLIST: BACKPORT: procfs/dmabuf: Add inode numbe... I41407760c:FROMLIST: procfs: Allow reading fdinfo with PTRACE... Bug: 159126739 Bug: 167141117 Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Change-Id: Iede4e9a65f87dad1ca0c6ecdeb42de47af4b37c8	2021-07-12 22:38:09 +00:00
Kalesh Singh	5ded961aa2	Revert "FROMLIST: BACKPORT: procfs/dmabuf: Add inode number to /..." Revert submission 1578844 Reason for revert: Will be replaced by upstream version Reverted Changes: Ic9c551998:FROMLIST: BACKPORT: procfs/dmabuf: Add inode numbe... I41407760c:FROMLIST: procfs: Allow reading fdinfo with PTRACE... Bug: 159126739 Bug: 167141117 Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Change-Id: If02a6dc9a525193f286a138791de49085cd91972	2021-07-12 22:38:01 +00:00
Jaegeuk Kim	3ee5565017	UPSTREAM: f2fs: initialize page->private when using for our internal use We need to guarantee it's initially zero. Otherwise, it'll hurt entire flag operations. Bug: 192173168 Fixes: b763f3bedc2d ("f2fs: restructure f2fs page.private layout") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@google.com> (cherry picked from commit c9ebd3df43c0 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master) Change-Id: I02211d4bd2b1cb526e5fbcd55673043e3082d011	2021-07-12 21:01:34 +00:00
Kuan-Ying Lee	a7a3b31d58	ANDROID: syscall_check: add vendor hook for open syscall Through this vendor hook, we can get the timing to check current running task for the validation of its credential and open operation. Bug: 191291287 Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Change-Id: Ia644ceb02dbc230ee1d25cad3630c2c3f908e41a	2021-07-09 13:48:44 +00:00
Isaac J. Manjarres	bd2ca0ba5b	FROMLIST: pstore/ram: Rework logic for detecting ramoops reserved memory region The reserved memory region for ramoops is assumed to be at a fixed and known location when read from the devicetree. This is not desirable in environments where it is preferred for the region to be dynamically allocated at runtime, as opposed to it being fixed at compile time. Change the logic for detecting the start and size of the ramoops memory region by looking up the reserved memory region instead of using platform_get_resource(), which assumes that the location of the memory is known ahead of time. Bug: 191636717 Link: https://lore.kernel.org/patchwork/patch/1451704/ Change-Id: I24066de9f4fe1f1575cb1bbb1687c37a2b1938a4 Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> Signed-off-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>	2021-07-08 18:16:31 +00:00
Jaegeuk Kim	8102df91f2	Merge remote-tracking branch 'aosp/upstream-f2fs-stable-linux-5.10.y' into android12-5.10 * aosp/upstream-f2fs-stable-linux-5.10.y: Revert "f2fs: avoid attaching SB_ACTIVE flag during mount/remount" f2fs: remove false alarm on iget failure during GC f2fs: enable extent cache for compression files in read-only f2fs: fix to avoid adding tab before doc section f2fs: introduce f2fs_casefolded_name slab cache f2fs: swap: support migrating swapfile in aligned write mode f2fs: swap: remove dead codes f2fs: compress: add compress_inode to cache compressed blocks f2fs: clean up /sys/fs/f2fs/<disk>/features f2fs: add pin_file in feature list f2fs: Advertise encrypted casefolding in sysfs f2fs: Show casefolding support only when supported f2fs: support RO feature f2fs: logging neatening Bug: 186107892 Bug: 190759634 Bug: 190517210 Signed-off-by: Jaegeuk Kim <jaegeuk@google.com> Change-Id: I8eb93d8a43304b98166676da52a9c2434b15b942	2021-06-23 19:54:41 -07:00
Jaegeuk Kim	bb51a33182	Revert "f2fs: avoid attaching SB_ACTIVE flag during mount/remount" This reverts commit `42bbf0bcc2`.	2021-06-23 01:42:05 -07:00
Jaegeuk Kim	c81ac64da1	f2fs: remove false alarm on iget failure during GC This patch removes setting SBI_NEED_FSCK when GC gets an error on f2fs_iget, since f2fs_iget can give ENOMEM and others by race condition. If we set this critical fsck flag, we'll get EIO during fsync via the below code path. In f2fs_inplace_write_data(), if (is_sbi_flag_set(sbi, SBI_NEED_FSCK) \|\| f2fs_cp_error(sbi)) { err = -EIO; goto drop_bio; } Fixes: 9557727876674 ("f2fs: drop inplace IO if fs status is abnormal") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2021-06-23 01:41:57 -07:00
Daeho Jeong	cdeff03989	f2fs: enable extent cache for compression files in read-only Let's allow extent cache for RO partition. Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2021-06-21 07:24:26 -07:00
Chao Yu	44e0be85eb	f2fs: introduce f2fs_casefolded_name slab cache Add a slab cache: "f2fs_casefolded_name" for memory allocation of casefold name. Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2021-06-21 07:24:24 -07:00
Greg Kroah-Hartman	9e08e97ec6	Merge 5.10.43 into android12-5.10 Changes in 5.10.43 btrfs: tree-checker: do not error out if extent ref hash doesn't match net: usb: cdc_ncm: don't spew notifications hwmon: (dell-smm-hwmon) Fix index values hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_3 for RAA228228 netfilter: conntrack: unregister ipv4 sockopts on error unwind efi/fdt: fix panic when no valid fdt found efi: Allow EFI_MEMORY_XP and EFI_MEMORY_RO both to be cleared efi/libstub: prevent read overflow in find_file_option() efi: cper: fix snprintf() use in cper_dimm_err_location() vfio/pci: Fix error return code in vfio_ecap_init() vfio/pci: zap_vma_ptes() needs MMU samples: vfio-mdev: fix error handing in mdpy_fb_probe() vfio/platform: fix module_put call in error flow ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service HID: logitech-hidpp: initialize level variable HID: pidff: fix error return code in hid_pidff_init() HID: i2c-hid: fix format string mismatch devlink: Correct VIRTUAL port to not have phys_port attributes net/sched: act_ct: Offload connections with commit action net/sched: act_ct: Fix ct template allocation for zone 0 mptcp: always parse mptcp options for MPC reqsk nvme-rdma: fix in-casule data send for chained sgls ACPICA: Clean up context mutex during object deletion perf probe: Fix NULL pointer dereference in convert_variable_location() net: dsa: tag_8021q: fix the VLAN IDs used for encoding sub-VLANs net: sock: fix in-kernel mark setting net/tls: Replace TLS_RX_SYNC_RUNNING with RCU net/tls: Fix use-after-free after the TLS device goes down and up net/mlx5e: Fix incompatible casting net/mlx5: Check firmware sync reset requested is set before trying to abort it net/mlx5e: Check for needed capability for cvlan matching net/mlx5: DR, Create multi-destination flow table with level less than 64 nvmet: fix freeing unallocated p2pmem netfilter: nft_ct: skip expectations for confirmed conntrack netfilter: nfnetlink_cthelper: hit EBUSY on updates if 
size mismatches drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest() bpf: Simplify cases in bpf_base_func_proto bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks ieee802154: fix error return code in ieee802154_add_iface() ieee802154: fix error return code in ieee802154_llsec_getparams() igb: add correct exception tracing for XDP ixgbevf: add correct exception tracing for XDP cxgb4: fix regression with HASH tc prio value update ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions ice: Fix allowing VF to request more/less queues via virtchnl ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared ice: handle the VF VSI rebuild failure ice: report supported and advertised autoneg using PHY capabilities ice: Allow all LLDP packets from PF to Tx i2c: qcom-geni: Add shutdown callback for i2c cxgb4: avoid link re-train during TC-MQPRIO configuration i40e: optimize for XDP_REDIRECT in xsk path i40e: add correct exception tracing for XDP ice: simplify ice_run_xdp ice: optimize for XDP_REDIRECT in xsk path ice: add correct exception tracing for XDP ixgbe: optimize for XDP_REDIRECT in xsk path ixgbe: add correct exception tracing for XDP arm64: dts: ti: j7200-main: Mark Main NAVSS as dma-coherent optee: use export_uuid() to copy client UUID bus: ti-sysc: Fix am335x resume hang for usb otg module arm64: dts: ls1028a: fix memory node arm64: dts: zii-ultra: fix 12V_MAIN voltage arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property ARM: dts: imx7d-pico: Fix the 'tuning-step' property ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act tipc: add extack messages for bearer/media failure tipc: fix unique bearer names sanity check serial: stm32: fix threaded interrupt handling riscv: vdso: fix and clean-up Makefile io_uring: fix link timeout refs io_uring: use better types for 
cflags drm/amdgpu/vcn3: add cancel_delayed_work_sync before power gate drm/amdgpu/jpeg2.5: add cancel_delayed_work_sync before power gate drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gate Bluetooth: fix the erroneous flush_work() order Bluetooth: use correct lock to prevent UAF of hdev object wireguard: do not use -O3 wireguard: peer: allocate in kmem_cache wireguard: use synchronize_net rather than synchronize_rcu wireguard: selftests: remove old conntrack kconfig value wireguard: selftests: make sure rp_filter is disabled on vethc wireguard: allowedips: initialize list head in selftest wireguard: allowedips: remove nodes in O(1) wireguard: allowedips: allocate nodes in kmem_cache wireguard: allowedips: free empty intermediate nodes when removing single node net: caif: added cfserl_release function net: caif: add proper error handling net: caif: fix memory leak in caif_device_notify net: caif: fix memory leak in cfusbl_device_notify HID: i2c-hid: Skip ELAN power-on command after reset HID: magicmouse: fix NULL-deref on disconnect HID: multitouch: require Finger field to mark Win8 reports as MT gfs2: fix scheduling while atomic bug in glocks ALSA: timer: Fix master timer notification ALSA: hda: Fix for mute key LED for HP Pavilion 15-CK0xx ALSA: hda: update the power_state during the direct-complete ARM: dts: imx6dl-yapp4: Fix RGMII connection to QCA8334 switch ARM: dts: imx6q-dhcom: Add PU,VDD1P1,VDD2P5 regulators ext4: fix memory leak in ext4_fill_super ext4: fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed ext4: fix fast commit alignment issues ext4: fix memory leak in ext4_mb_init_backend on error path. 
ext4: fix accessing uninit percpu counter variable with fast_commit usb: dwc2: Fix build in periphal-only mode pid: take a reference when initializing `cad_pid` ocfs2: fix data corruption by fallocate mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests() mm/page_alloc: fix counting of free pages after take off from buddy x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid() x86/sev: Check SME/SEV support in CPUID first nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect drm/amdgpu: Don't query CE and UE errors drm/amdgpu: make sure we unpin the UVD BO x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing powerpc/kprobes: Fix validation of prefixed instructions across page boundary btrfs: mark ordered extent and inode with error if we fail to finish btrfs: fix error handling in btrfs_del_csums btrfs: return errors from btrfs_del_csums in cleanup_ref_head btrfs: fixup error handling in fixup_inode_link_counts btrfs: abort in rename_exchange if we fail to insert the second ref btrfs: fix deadlock when cloning inline extents and low on available space mm, hugetlb: fix simple resv_huge_pages underflow on UFFDIO_COPY drm/msm/dpu: always use mdp device to scale bandwidth btrfs: fix unmountable seed device after fstrim KVM: SVM: Truncate GPR value for DR and CR accesses in !64-bit mode KVM: arm64: Fix debug register indexing x86/kvm: Teardown PV features on boot CPU as well x86/kvm: Disable kvmclock on all CPUs on shutdown x86/kvm: Disable all PV features on crash lib/lz4: explicitly support in-place decompression i2c: qcom-geni: Suspend and resume the bus during SYSTEM_SLEEP_PM ops netfilter: nf_tables: missing error reporting for not selected expressions xen-netback: take a reference to the RX task thread neighbour: allow NUD_NOARP entries to be forced GCed Linux 5.10.43 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I8d7ec0878193e4e454076809b7fb71fcc4e3d810	2021-06-12 14:48:14 +02:00
Anand Jain	fe910d20e2	btrfs: fix unmountable seed device after fstrim commit 5e753a817b2d5991dfe8a801b7b1e8e79a1c5a20 upstream. The following test case reproduces an issue of wrongly freeing in-use blocks on the readonly seed device when fstrim is called on the rw sprout device. As shown below. Create a seed device and add a sprout device to it: $ mkfs.btrfs -fq -dsingle -msingle /dev/loop0 $ btrfstune -S 1 /dev/loop0 $ mount /dev/loop0 /btrfs $ btrfs dev add -f /dev/loop1 /btrfs BTRFS info (device loop0): relocating block group 290455552 flags system BTRFS info (device loop0): relocating block group 1048576 flags system BTRFS info (device loop0): disk added /dev/loop1 $ umount /btrfs Mount the sprout device and run fstrim: $ mount /dev/loop1 /btrfs $ fstrim /btrfs $ umount /btrfs Now try to mount the seed device, and it fails: $ mount /dev/loop0 /btrfs mount: /btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. Block 5292032 is missing on the readonly seed device: $ dmesg -kt \| tail <snip> BTRFS error (device loop0): bad tree block start, want 5292032 have 0 BTRFS warning (device loop0): couldn't read-tree root BTRFS error (device loop0): open_ctree failed From the dump-tree of the seed device (taken before the fstrim). Block 5292032 belonged to the block group starting at 5242880: $ btrfs inspect dump-tree -e /dev/loop0 \| grep -A1 BLOCK_GROUP <snip> item 3 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16169 itemsize 24 block group used 114688 chunk_objectid 256 flags METADATA <snip> From the dump-tree of the sprout device (taken before the fstrim). fstrim used block-group 5242880 to find the related free space to free: $ btrfs inspect dump-tree -e /dev/loop1 \| grep -A1 BLOCK_GROUP <snip> item 1 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16226 itemsize 24 block group used 32768 chunk_objectid 256 flags METADATA <snip> BPF kernel tracing the fstrim command finds the missing block 5292032 within the range of the discarded blocks as below: kprobe:btrfs_discard_extent { printf("freeing start %llu end %llu num_bytes %llu:\n", arg1, arg1+arg2, arg2); } freeing start 5259264 end 5406720 num_bytes 147456 <snip> Fix this by avoiding the discard command to the readonly seed device. Reported-by: Chris Murphy <lists@colorremedies.com> CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-10 13:39:28 +02:00
Filipe Manana	baa6763123	btrfs: fix deadlock when cloning inline extents and low on available space commit 76a6d5cd74479e7ec8a7f9a29bce63d5549b6b2e upstream. There are a few cases where cloning an inline extent requires copying data into a page of the destination inode. For these cases we are allocating the required data and metadata space while holding a leaf locked. This can result in a deadlock when we are low on available space because allocating the space may flush delalloc and two deadlock scenarios can happen: 1) When starting writeback for an inode with a very small dirty range that fits in an inline extent, we deadlock during the writeback when trying to insert the inline extent, at cow_file_range_inline(), if the extent is going to be located in the leaf for which we are already holding a read lock; 2) After successfully starting writeback, for non-inline extent cases, the async reclaim thread will hang waiting for an ordered extent to complete if the ordered extent completion needs to modify the leaf for which the clone task is holding a read lock (for adding or replacing file extent items). So the cloning task will wait forever on the async reclaim thread to make progress, which in turn is waiting for the ordered extent completion which in turn is waiting to acquire a write lock on the same leaf. So fix this by making sure we release the path (and therefore the leaf) every time we need to copy the inline extent's data into a page of the destination inode, as by that time we do not need to have the leaf locked. Fixes: `05a5a7621c` ("Btrfs: implement full reflink support for inline extents") CC: stable@vger.kernel.org # 5.10+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-10 13:39:28 +02:00
Josef Bacik	0df50d47d1	btrfs: abort in rename_exchange if we fail to insert the second ref commit dc09ef3562726cd520c8338c1640872a60187af5 upstream. Error injection stress uncovered a problem where we'd leave a dangling inode ref if we failed during a rename_exchange. This happens because we insert the inode ref for one side of the rename, and then for the other side. If this second inode ref insert fails we'll leave the first one dangling and leave a corrupt file system behind. Fix this by aborting if we did the insert for the first inode ref. CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-10 13:39:28 +02:00
Josef Bacik	48568f3944	btrfs: fixup error handling in fixup_inode_link_counts commit 011b28acf940eb61c000059dd9e2cfcbf52ed96b upstream. This function has the following pattern while (1) { ret = whatever(); if (ret) goto out; } ret = 0 out: return ret; However several places in this while loop we simply break; when there's a problem, thus clearing the return value, and in one case we do a return -EIO, and leak the memory for the path. Fix this by re-arranging the loop to deal with ret == 1 coming from btrfs_search_slot, and then simply delete the ret = 0; out: bit so everybody can break if there is an error, which will allow for proper error handling to occur. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-10 13:39:28 +02:00
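
Note on 48568f3944: the control-flow problem described in that commit message can be sketched in isolation as below. This is only an illustration of the pattern, not the btrfs code; process_item(), scan_buggy(), scan_fixed() and check_scan_behaviour() are hypothetical names, and negative array values are an assumed way to model a failing step.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one loop step: a negative item models a failure. */
static int process_item(int item)
{
	return item < 0 ? -EIO : 0;
}

/* Buggy shape: a "break" on error falls through to "ret = 0". */
static int scan_buggy(const int *items, size_t n)
{
	int ret;
	size_t i = 0;

	while (1) {
		if (i == n)
			break;
		ret = process_item(items[i++]);
		if (ret)
			break;	/* the error is overwritten just below */
	}
	ret = 0;
	return ret;
}

/* Fixed shape: no trailing "ret = 0", so a "break" keeps the error code. */
static int scan_fixed(const int *items, size_t n)
{
	int ret = 0;
	size_t i = 0;

	while (1) {
		if (i == n)
			break;
		ret = process_item(items[i++]);
		if (ret)
			break;
	}
	return ret;
}

/* Usage sketch: the same failing input is hidden by one shape, reported by the other. */
static void check_scan_behaviour(void)
{
	const int items[] = { 1, -1, 2 };

	assert(scan_buggy(items, 3) == 0);
	assert(scan_fixed(items, 3) == -EIO);
}
```

With the trailing assignment removed, every exit from the loop can use a plain break and the caller still sees the failure, which is the property the commit message asks for.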


68698 Commits