From e2a4080d0429343560249dcf84bb2bb535f752a7 Mon Sep 17 00:00:00 2001 From: Lee Jones Date: Fri, 25 Nov 2022 12:07:50 +0000 Subject: [PATCH 01/51] BACKPORT: Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [ Upstream commit 152fe65f300e1819d59b80477d3e0999b4d5d7d2 ] When enabled, KASAN enlarges function's stack-frames. Pushing quite a few over the current threshold. This can mainly be seen on 32-bit architectures where the present limit (when !GCC) is a lowly 1024-Bytes. Bug: 261962742 Link: https://lkml.kernel.org/r/20221125120750.3537134-3-lee@kernel.org Signed-off-by: Lee Jones Acked-by: Arnd Bergmann Cc: Alex Deucher Cc: "Christian König" Cc: Daniel Vetter Cc: David Airlie Cc: Harry Wentland Cc: Leo Li Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: "Pan, Xinhui" Cc: Rodrigo Siqueira Cc: Thomas Zimmermann Cc: Tom Rix Cc: Signed-off-by: Andrew Morton Signed-off-by: Sasha Levin Change-Id: I505a5187220b426fe49c0f15bf1704198082f63d Signed-off-by: Lee Jones --- lib/Kconfig.debug | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index d0740234b87a..c21eaff38d92 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -298,6 +298,7 @@ config FRAME_WARN int "Warn for stack frames larger than" range 0 8192 default 2048 if GCC_PLUGIN_LATENT_ENTROPY + default 1280 if KASAN && !64BIT default 1280 if (!64BIT && PARISC) default 1024 if (!64BIT && !PARISC) default 2048 if 64BIT From eddb2f39cd5396c9feacda79e2ceadbdf714c7a4 Mon Sep 17 00:00:00 2001 From: Lee Jones Date: Fri, 25 Nov 2022 12:07:49 +0000 Subject: [PATCH 02/51] UPSTREAM: drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit commit 6f6cb1714365a07dbc66851879538df9f6969288 upstream. Patch series "Fix a bunch of allmodconfig errors", v2. Since b339ec9c229aa ("kbuild: Only default to -Werror if COMPILE_TEST") WERROR now defaults to COMPILE_TEST meaning that it's enabled for allmodconfig builds. This leads to some interesting build failures when using Clang, each resolved in this set. With this set applied, I am able to obtain a successful allmodconfig Arm build. This patch (of 2): calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 || ARM64) architectures built with Clang (all released versions), whereby the stack frame gets blown up to well over 5k. This would cause an immediate kernel panic on most architectures. We'll revert this when the following bug report has been resolved: https://github.com/llvm/llvm-project/issues/41896. 
Bug: 261962742 Link: https://lkml.kernel.org/r/20221125120750.3537134-1-lee@kernel.org Link: https://lkml.kernel.org/r/20221125120750.3537134-2-lee@kernel.org Signed-off-by: Lee Jones Suggested-by: Arnd Bergmann Acked-by: Arnd Bergmann Cc: Alex Deucher Cc: "Christian König" Cc: Daniel Vetter Cc: David Airlie Cc: Harry Wentland Cc: Lee Jones Cc: Leo Li Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: "Pan, Xinhui" Cc: Rodrigo Siqueira Cc: Thomas Zimmermann Cc: Tom Rix Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman Signed-off-by: Lee Jones Change-Id: Iaa42b18cdcf9fe23d740c036371bd7950d431e14 Signed-off-by: Lee Jones --- drivers/gpu/drm/amd/display/Kconfig | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig index f3274eb6b341..6c4cba09d23b 100644 --- a/drivers/gpu/drm/amd/display/Kconfig +++ b/drivers/gpu/drm/amd/display/Kconfig @@ -5,6 +5,7 @@ menu "Display Engine Configuration" config DRM_AMD_DC bool "AMD DC - Enable new display engine" default y + depends on BROKEN || !CC_IS_CLANG || X86_64 || SPARC64 || ARM64 select SND_HDA_COMPONENT if SND_HDA_CORE select DRM_AMD_DC_DCN if (X86 || PPC64) && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS) help @@ -12,6 +13,12 @@ config DRM_AMD_DC support for AMDGPU. This adds required support for Vega and Raven ASICs. + calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 || ARM64) + architectures built with Clang (all released versions), whereby the stack + frame gets blown up to well over 5k. This would cause an immediate kernel + panic on most architectures. We'll revert this when the following bug report + has been resolved: https://github.com/llvm/llvm-project/issues/41896. + config DRM_AMD_DC_DCN def_bool n help From ce18af9b5d7d0baad2ac3eea4c732d2bf128d690 Mon Sep 17 00:00:00 2001 From: Pavankumar Kondeti Date: Thu, 8 Dec 2022 16:16:37 +0530 Subject: [PATCH 03/51] ANDROID: dma-buf: don't re-purpose kobject as work_struct The commit 5aec776ef8c9 ("BACKPORT: ANDROID: dma-buf: Move sysfs work out of DMA-BUF export path) re-purposed kobject as work_struct temporarily to create the sysfs entries asynchronously. The author knows what he is doing and rightly added a build assert if kobject struct size is smaller than the work_struct size. We are hitting this build assert on a non-GKI platform where CONFIG_ANDROID_KABI_RESERVE is not set. Fix this problem by allocating a new union with dma_buf_sysfs_entry structure and temporary structure as members. We only end up allocating more memory (because of union) only when kobject size is smaller than work_struct which the original patch any way assumed would never be true. 
Bug: 261818147 Change-Id: Ifb089bf80d8a3a44ece9f05fc0b99ee76cb11645 Signed-off-by: Pavankumar Kondeti --- drivers/dma-buf/dma-buf-sysfs-stats.c | 44 +++++++++++++++------------ 1 file changed, 24 insertions(+), 20 deletions(-) diff --git a/drivers/dma-buf/dma-buf-sysfs-stats.c b/drivers/dma-buf/dma-buf-sysfs-stats.c index 7ae64cd3dbb8..5c8efa55e3aa 100644 --- a/drivers/dma-buf/dma-buf-sysfs-stats.c +++ b/drivers/dma-buf/dma-buf-sysfs-stats.c @@ -142,15 +142,21 @@ void dma_buf_uninit_sysfs_statistics(void) kset_unregister(dma_buf_stats_kset); } +struct dma_buf_create_sysfs_entry { + struct dma_buf *dmabuf; + struct work_struct work; +}; + +union dma_buf_create_sysfs_work_entry { + struct dma_buf_create_sysfs_entry create_entry; + struct dma_buf_sysfs_entry sysfs_entry; +}; + static void sysfs_add_workfn(struct work_struct *work) { - /* The ABI would have to change for this to be false, but let's be paranoid. */ - _Static_assert(sizeof(struct kobject) >= sizeof(struct work_struct), - "kobject is smaller than work_struct"); - - struct dma_buf_sysfs_entry *sysfs_entry = - container_of((struct kobject *)work, struct dma_buf_sysfs_entry, kobj); - struct dma_buf *dmabuf = sysfs_entry->dmabuf; + struct dma_buf_create_sysfs_entry *create_entry = + container_of(work, struct dma_buf_create_sysfs_entry, work); + struct dma_buf *dmabuf = create_entry->dmabuf; /* * A dmabuf is ref-counted via its file member. If this handler holds the only @@ -161,6 +167,7 @@ static void sysfs_add_workfn(struct work_struct *work) * is released, and that can't happen until the end of this function. */ if (file_count(dmabuf->file) > 1) { + dmabuf->sysfs_entry->dmabuf = dmabuf; /* * kobject_init_and_add expects kobject to be zero-filled, but we have populated it * to trigger this work function. @@ -185,8 +192,8 @@ static void sysfs_add_workfn(struct work_struct *work) int dma_buf_stats_setup(struct dma_buf *dmabuf) { - struct dma_buf_sysfs_entry *sysfs_entry; - struct work_struct *work; + struct dma_buf_create_sysfs_entry *create_entry; + union dma_buf_create_sysfs_work_entry *work_entry; if (!dmabuf || !dmabuf->file) return -EINVAL; @@ -196,21 +203,18 @@ int dma_buf_stats_setup(struct dma_buf *dmabuf) return -EINVAL; } - sysfs_entry = kmalloc(sizeof(struct dma_buf_sysfs_entry), GFP_KERNEL); - if (!sysfs_entry) + work_entry = kmalloc(sizeof(union dma_buf_create_sysfs_work_entry), GFP_KERNEL); + if (!work_entry) return -ENOMEM; - sysfs_entry->dmabuf = dmabuf; - dmabuf->sysfs_entry = sysfs_entry; + dmabuf->sysfs_entry = &work_entry->sysfs_entry; - /* - * The use of kobj as a work_struct is an ugly hack - * to avoid an ABI break in this frozen kernel. - */ - work = (struct work_struct *)&dmabuf->sysfs_entry->kobj; - INIT_WORK(work, sysfs_add_workfn); + create_entry = &work_entry->create_entry; + create_entry->dmabuf = dmabuf; + + INIT_WORK(&create_entry->work, sysfs_add_workfn); get_dma_buf(dmabuf); /* This reference will be dropped in sysfs_add_workfn. */ - schedule_work(work); + schedule_work(&create_entry->work); return 0; } From 8ad88eae4b6914c512569b10c19e04754def1746 Mon Sep 17 00:00:00 2001 From: Pavankumar Kondeti Date: Thu, 8 Dec 2022 15:05:08 +0530 Subject: [PATCH 04/51] ANDROID: dma-buf: Fix build breakage with !CONFIG_DMABUF_SYSFS_STATS The commit c5589c7eec41 ("ANDROID: dma-buf: Add vendor hook for deferred dmabuf sysfs stats release") introduced a build breakage on non-GKI targets which don't have CONFIG_DMABUF_SYSFS_STATS enabled. 
It is due to invisibility of struct dma_buf_sysfs_entry in the trace hook header file. We can get away with it by moving the header inclusion from trace hook header to vendor hooks driver. Bug: 261818075 Change-Id: Ibb79bd67c9f1b36fe2b5d569ab9369f376a78b77 Signed-off-by: Pavankumar Kondeti --- drivers/android/vendor_hooks.c | 1 + include/trace/hooks/dmabuf.h | 6 ------ 2 files changed, 1 insertion(+), 6 deletions(-) diff --git a/drivers/android/vendor_hooks.c b/drivers/android/vendor_hooks.c index 12066949e275..91288f59a390 100644 --- a/drivers/android/vendor_hooks.c +++ b/drivers/android/vendor_hooks.c @@ -7,6 +7,7 @@ */ #ifndef __GENKSYMS__ +#include #include #endif diff --git a/include/trace/hooks/dmabuf.h b/include/trace/hooks/dmabuf.h index 8963742273bc..0182960efca2 100644 --- a/include/trace/hooks/dmabuf.h +++ b/include/trace/hooks/dmabuf.h @@ -11,13 +11,7 @@ #include -#ifdef __GENKSYMS__ struct dma_buf_sysfs_entry; -#else -/* struct dma_buf_sysfs_entry */ -#include -#endif - DECLARE_RESTRICTED_HOOK(android_rvh_dma_buf_stats_teardown, TP_PROTO(struct dma_buf_sysfs_entry *sysfs_entry, bool *skip_sysfs_release), TP_ARGS(sysfs_entry, skip_sysfs_release), 1); From d37e563bff54a64d9cb2a7402951a733ba1d1e49 Mon Sep 17 00:00:00 2001 From: Dan Vacura Date: Thu, 8 Dec 2022 15:30:21 -0600 Subject: [PATCH 05/51] ANDROID: usb: gadget: uvc: remove duplicate code in unbind The uvc_function_unbind() was calling the same code two times, increasing a timeout that may occur. The duplicate code looks to have come in during the merge of 5.10.117. Remove the duplicate code. Bug: 261895714 Change-Id: I8957048bfad4a9e01baea033de9b628362b2d991 Signed-off-by: Dan Vacura --- drivers/usb/gadget/function/f_uvc.c | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c index 607ea0c1bfb3..1fc00cce83fb 100644 --- a/drivers/usb/gadget/function/f_uvc.c +++ b/drivers/usb/gadget/function/f_uvc.c @@ -906,18 +906,6 @@ static void uvc_function_unbind(struct usb_configuration *c, uvcg_dbg(f, "done waiting with ret: %ld\n", wait_ret); } - /* If we know we're connected via v4l2, then there should be a cleanup - * of the device from userspace either via UVC_EVENT_DISCONNECT or - * though the video device removal uevent. Allow some time for the - * application to close out before things get deleted. - */ - if (uvc->func_connected) { - uvcg_dbg(f, "waiting for clean disconnect\n"); - wait_ret = wait_event_interruptible_timeout(uvc->func_connected_queue, - uvc->func_connected == false, msecs_to_jiffies(500)); - uvcg_dbg(f, "done waiting with ret: %ld\n", wait_ret); - } - device_remove_file(&uvc->vdev.dev, &dev_attr_function_name); video_unregister_device(&uvc->vdev); v4l2_device_unregister(&uvc->v4l2_dev); From 65654da06db8f6a7e88151240d354b233e086871 Mon Sep 17 00:00:00 2001 From: Pavel Machek Date: Fri, 20 Aug 2021 10:26:24 +0200 Subject: [PATCH 06/51] UPSTREAM: Documentation: leds: standartizing LED names We have a list of valid functions, but LED names in sysfs are still far from being consistent. Create list of "well known" LED names so we nudge people towards using same LED names (except color) for same functionality. 
Signed-off-by: Pavel Machek Bug: 260685629 (cherry picked from commit 09f1273064eea23ec41fb206f6eccc2bf79d1fa1) Change-Id: Iea12a9c230d6cd072b0f4fd4e0c616348173dd53 Signed-off-by: Farid Chahla --- Documentation/leds/well-known-leds.txt | 58 ++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) create mode 100644 Documentation/leds/well-known-leds.txt diff --git a/Documentation/leds/well-known-leds.txt b/Documentation/leds/well-known-leds.txt new file mode 100644 index 000000000000..4a8b9dc4bf52 --- /dev/null +++ b/Documentation/leds/well-known-leds.txt @@ -0,0 +1,58 @@ +-*- org -*- + +It is somehow important to provide consistent interface to the +userland. LED devices have one problem there, and that is naming of +directories in /sys/class/leds. It would be nice if userland would +just know right "name" for given LED function, but situation got more +complex. + +Anyway, if backwards compatibility is not an issue, new code should +use one of the "good" names from this list, and you should extend the +list where applicable. + +Legacy names are listed, too; in case you are writing application that +wants to use particular feature, you should probe for good name, first, +but then try the legacy ones, too. + +Notice there's a list of functions in include/dt-bindings/leds/common.h . + +* Keyboards + +Good: "input*:*:capslock" +Good: "input*:*:scrolllock" +Good: "input*:*:numlock" +Legacy: "shift-key-light" (Motorola Droid 4, capslock) + +Set of common keyboard LEDs, going back to PC AT or so. + +Legacy: "tpacpi::thinklight" (IBM/Lenovo Thinkpads) +Legacy: "lp5523:kb{1,2,3,4,5,6}" (Nokia N900) + +Frontlight/backlight of main keyboard. + +Legacy: "button-backlight" (Motorola Droid 4) + +Some phones have touch buttons below screen; it is different from main +keyboard. And this is their backlight. + +* Sound subsystem + +Good: "platform:*:mute" +Good: "platform:*:micmute" + +LEDs on notebook body, indicating that sound input / output is muted. + +* System notification + +Legacy: "status-led:{red,green,blue}" (Motorola Droid 4) +Legacy: "lp5523:{r,g,b}" (Nokia N900) + +Phones usually have multi-color status LED. + +* Power management + +Good: "platform:*:charging" (allwinner sun50i) + +* Screen + +Good: ":backlight" (Motorola Droid 4) From e1cd3ffe478871deb44f1ce2fbe6a6c8e7e08ab3 Mon Sep 17 00:00:00 2001 From: Roderick Colenbrander Date: Wed, 8 Sep 2021 09:55:37 -0700 Subject: [PATCH 07/51] UPSTREAM: HID: playstation: expose DualSense lightbar through a multi-color LED. The DualSense lightbar has so far been supported, but it was not yet adjustable from user space. This patch exposes it through a multi-color LED. Signed-off-by: Roderick Colenbrander Signed-off-by: Jiri Kosina Bug: 260685629 (cherry picked from commit fc97b4d6a1a6d418fd4053fd7716eca746fdd163) Change-Id: I48204113da804b13ad5bed2f651a5826ab5a86f7 Signed-off-by: Farid Chahla --- drivers/hid/hid-playstation.c | 72 +++++++++++++++++++++++++++++++++++ 1 file changed, 72 insertions(+) diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index ab7c82c2e886..ff2fc315a89d 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -11,6 +11,8 @@ #include #include #include +#include +#include #include #include @@ -38,6 +40,7 @@ struct ps_device { uint8_t battery_capacity; int battery_status; + const char *input_dev_name; /* Name of primary input device. */ uint8_t mac_address[6]; /* Note: stored in little endian order. 
*/ uint32_t hw_version; uint32_t fw_version; @@ -147,6 +150,7 @@ struct dualsense { uint8_t motor_right; /* RGB lightbar */ + struct led_classdev_mc lightbar; bool update_lightbar; uint8_t lightbar_red; uint8_t lightbar_green; @@ -288,6 +292,8 @@ static const struct {int x; int y; } ps_gamepad_hat_mapping[] = { {0, 0}, }; +static void dualsense_set_lightbar(struct dualsense *ds, uint8_t red, uint8_t green, uint8_t blue); + /* * Add a new ps_device to ps_devices if it doesn't exist. * Return error on duplicate device, which can happen if the same @@ -525,6 +531,45 @@ static int ps_get_report(struct hid_device *hdev, uint8_t report_id, uint8_t *bu return 0; } +/* Register a DualSense/DualShock4 RGB lightbar represented by a multicolor LED. */ +static int ps_lightbar_register(struct ps_device *ps_dev, struct led_classdev_mc *lightbar_mc_dev, + int (*brightness_set)(struct led_classdev *, enum led_brightness)) +{ + struct hid_device *hdev = ps_dev->hdev; + struct mc_subled *mc_led_info; + struct led_classdev *led_cdev; + int ret; + + mc_led_info = devm_kmalloc_array(&hdev->dev, 3, sizeof(*mc_led_info), + GFP_KERNEL | __GFP_ZERO); + if (!mc_led_info) + return -ENOMEM; + + mc_led_info[0].color_index = LED_COLOR_ID_RED; + mc_led_info[1].color_index = LED_COLOR_ID_GREEN; + mc_led_info[2].color_index = LED_COLOR_ID_BLUE; + + lightbar_mc_dev->subled_info = mc_led_info; + lightbar_mc_dev->num_colors = 3; + + led_cdev = &lightbar_mc_dev->led_cdev; + led_cdev->name = devm_kasprintf(&hdev->dev, GFP_KERNEL, "%s:rgb:indicator", + ps_dev->input_dev_name); + if (!led_cdev->name) + return -ENOMEM; + led_cdev->brightness = 255; + led_cdev->max_brightness = 255; + led_cdev->brightness_set_blocking = brightness_set; + + ret = devm_led_classdev_multicolor_register(&hdev->dev, lightbar_mc_dev); + if (ret < 0) { + hid_err(hdev, "Cannot register multicolor LED device\n"); + return ret; + } + + return 0; +} + static struct input_dev *ps_sensors_create(struct hid_device *hdev, int accel_range, int accel_res, int gyro_range, int gyro_res) { @@ -761,6 +806,22 @@ err_free: return ret; } +static int dualsense_lightbar_set_brightness(struct led_classdev *cdev, + enum led_brightness brightness) +{ + struct led_classdev_mc *mc_cdev = lcdev_to_mccdev(cdev); + struct dualsense *ds = container_of(mc_cdev, struct dualsense, lightbar); + uint8_t red, green, blue; + + led_mc_calc_color_components(mc_cdev, brightness); + red = mc_cdev->subled_info[0].brightness; + green = mc_cdev->subled_info[1].brightness; + blue = mc_cdev->subled_info[2].brightness; + + dualsense_set_lightbar(ds, red, green, blue); + return 0; +} + static void dualsense_init_output_report(struct dualsense *ds, struct dualsense_output_report *rp, void *buf) { @@ -1106,10 +1167,14 @@ static int dualsense_reset_leds(struct dualsense *ds) static void dualsense_set_lightbar(struct dualsense *ds, uint8_t red, uint8_t green, uint8_t blue) { + unsigned long flags; + + spin_lock_irqsave(&ds->base.lock, flags); ds->update_lightbar = true; ds->lightbar_red = red; ds->lightbar_green = green; ds->lightbar_blue = blue; + spin_unlock_irqrestore(&ds->base.lock, flags); schedule_work(&ds->output_worker); } @@ -1196,6 +1261,8 @@ static struct ps_device *dualsense_create(struct hid_device *hdev) ret = PTR_ERR(ds->gamepad); goto err; } + /* Use gamepad input device name as primary device name for e.g. 
LEDs */ + ps_dev->input_dev_name = dev_name(&ds->gamepad->dev); ds->sensors = ps_sensors_create(hdev, DS_ACC_RANGE, DS_ACC_RES_PER_G, DS_GYRO_RANGE, DS_GYRO_RES_PER_DEG_S); @@ -1223,6 +1290,11 @@ static struct ps_device *dualsense_create(struct hid_device *hdev) if (ret) goto err; + ret = ps_lightbar_register(ps_dev, &ds->lightbar, dualsense_lightbar_set_brightness); + if (ret) + goto err; + + /* Set default lightbar color. */ dualsense_set_lightbar(ds, 0, 0, 128); /* blue */ ret = ps_device_set_player_id(ps_dev); From a70e598cef39bf4c2df855832e3d5e121b57341d Mon Sep 17 00:00:00 2001 From: Roderick Colenbrander Date: Wed, 8 Sep 2021 09:55:38 -0700 Subject: [PATCH 08/51] UPSTREAM: leds: add new LED_FUNCTION_PLAYER for player LEDs for game controllers. Player LEDs are commonly found on game controllers from Nintendo and Sony to indicate a player ID across a number of LEDs. For example, "Player 2" might be indicated as "-x--" on a device with 4 LEDs where "x" means on. This patch introduces LED_FUNCTION_PLAYER1-5 defines to properly indicate player LEDs from the kernel. Until now there was no good standard, which resulted in inconsistent behavior across xpad, hid-sony, hid-wiimote and other drivers. Moving forward new drivers should use LED_FUNCTION_PLAYERx. Note: management of Player IDs is left to user space, though a kernel driver may pick a default value. Signed-off-by: Roderick Colenbrander Acked-by: Pavel Machek Signed-off-by: Jiri Kosina Bug: 260685629 (cherry picked from commit 61177c088a57bed259122f3c7bc6d61984936a12) Change-Id: Ie1de4d66304bb25fc2c9fcdb1ec9b7589ad9e7ac Signed-off-by: Farid Chahla --- Documentation/leds/well-known-leds.txt | 14 ++++++++++++++ include/dt-bindings/leds/common.h | 7 +++++++ 2 files changed, 21 insertions(+) diff --git a/Documentation/leds/well-known-leds.txt b/Documentation/leds/well-known-leds.txt index 4a8b9dc4bf52..2160382c86be 100644 --- a/Documentation/leds/well-known-leds.txt +++ b/Documentation/leds/well-known-leds.txt @@ -16,6 +16,20 @@ but then try the legacy ones, too. Notice there's a list of functions in include/dt-bindings/leds/common.h . +* Gamepads and joysticks + +Game controllers may feature LEDs to indicate a player number. This is commonly +used on game consoles in which multiple controllers can be connected to a system. +The "player LEDs" are then programmed with a pattern to indicate a particular +player. For example, a game controller with 4 LEDs, may be programmed with "x---" +to indicate player 1, "-x--" to indicate player 2 etcetera where "x" means on. +Input drivers can utilize the LED class to expose the individual player LEDs +of a game controller using the function "player". +Note: tracking and management of Player IDs is the responsibility of user space, +though drivers may pick a default value. + +Good: "input*:*:player-{1,2,3,4,5} + * Keyboards Good: "input*:*:capslock" diff --git a/include/dt-bindings/leds/common.h b/include/dt-bindings/leds/common.h index 52b619d44ba2..3be89a7c20a9 100644 --- a/include/dt-bindings/leds/common.h +++ b/include/dt-bindings/leds/common.h @@ -60,6 +60,13 @@ #define LED_FUNCTION_MICMUTE "micmute" #define LED_FUNCTION_MUTE "mute" +/* Used for player LEDs as found on game controllers from e.g. Nintendo, Sony. */ +#define LED_FUNCTION_PLAYER1 "player-1" +#define LED_FUNCTION_PLAYER2 "player-2" +#define LED_FUNCTION_PLAYER3 "player-3" +#define LED_FUNCTION_PLAYER4 "player-4" +#define LED_FUNCTION_PLAYER5 "player-5" + /* Miscelleaus functions. Use functions above if you can. 
*/ #define LED_FUNCTION_ACTIVITY "activity" #define LED_FUNCTION_ALARM "alarm" From f7901b46a2a79bcf10e46384ca32f871f8b3ea01 Mon Sep 17 00:00:00 2001 From: Roderick Colenbrander Date: Wed, 8 Sep 2021 09:55:39 -0700 Subject: [PATCH 09/51] UPSTREAM: HID: playstation: expose DualSense player LEDs through LED class. The DualSense player LEDs were so far not adjustable from user-space. This patch exposes each LED individually through the LED class. Each LED uses the new 'player' function resulting in a name like: 'inputX:white:player-1' for the first LED. Signed-off-by: Roderick Colenbrander Signed-off-by: Jiri Kosina Bug: 260685629 (cherry picked from commit 8c0ab553b072025530308f74b2c0223ec50dffe5) Change-Id: I49c699a99b0b8a7bb7980560e3ea7a12faf646aa Signed-off-by: Farid Chahla --- drivers/hid/hid-playstation.c | 85 ++++++++++++++++++++++++++++++++++- 1 file changed, 84 insertions(+), 1 deletion(-) diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index ff2fc315a89d..5cdfa71d1563 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -56,6 +56,13 @@ struct ps_calibration_data { int sens_denom; }; +struct ps_led_info { + const char *name; + const char *color; + enum led_brightness (*brightness_get)(struct led_classdev *cdev); + int (*brightness_set)(struct led_classdev *cdev, enum led_brightness); +}; + /* Seed values for DualShock4 / DualSense CRC32 for different report types. */ #define PS_INPUT_CRC32_SEED 0xA1 #define PS_OUTPUT_CRC32_SEED 0xA2 @@ -531,6 +538,32 @@ static int ps_get_report(struct hid_device *hdev, uint8_t report_id, uint8_t *bu return 0; } +static int ps_led_register(struct ps_device *ps_dev, struct led_classdev *led, + const struct ps_led_info *led_info) +{ + int ret; + + led->name = devm_kasprintf(&ps_dev->hdev->dev, GFP_KERNEL, + "%s:%s:%s", ps_dev->input_dev_name, led_info->color, led_info->name); + + if (!led->name) + return -ENOMEM; + + led->brightness = 0; + led->max_brightness = 1; + led->flags = LED_CORE_SUSPENDRESUME; + led->brightness_get = led_info->brightness_get; + led->brightness_set_blocking = led_info->brightness_set; + + ret = devm_led_classdev_register(&ps_dev->hdev->dev, led); + if (ret) { + hid_err(ps_dev->hdev, "Failed to register LED %s: %d\n", led_info->name, ret); + return ret; + } + + return 0; +} + /* Register a DualSense/DualShock4 RGB lightbar represented by a multicolor LED. 
*/ static int ps_lightbar_register(struct ps_device *ps_dev, struct led_classdev_mc *lightbar_mc_dev, int (*brightness_set)(struct led_classdev *, enum led_brightness)) @@ -822,6 +855,35 @@ static int dualsense_lightbar_set_brightness(struct led_classdev *cdev, return 0; } +static enum led_brightness dualsense_player_led_get_brightness(struct led_classdev *led) +{ + struct hid_device *hdev = to_hid_device(led->dev->parent); + struct dualsense *ds = hid_get_drvdata(hdev); + + return !!(ds->player_leds_state & BIT(led - ds->player_leds)); +} + +static int dualsense_player_led_set_brightness(struct led_classdev *led, enum led_brightness value) +{ + struct hid_device *hdev = to_hid_device(led->dev->parent); + struct dualsense *ds = hid_get_drvdata(hdev); + unsigned long flags; + unsigned int led_index; + + spin_lock_irqsave(&ds->base.lock, flags); + + led_index = led - ds->player_leds; + if (value == LED_OFF) + ds->player_leds_state &= ~BIT(led_index); + else + ds->player_leds_state |= BIT(led_index); + + ds->update_player_leds = true; + spin_unlock_irqrestore(&ds->base.lock, flags); + + schedule_work(&ds->output_worker); +} + static void dualsense_init_output_report(struct dualsense *ds, struct dualsense_output_report *rp, void *buf) { @@ -1207,7 +1269,20 @@ static struct ps_device *dualsense_create(struct hid_device *hdev) struct dualsense *ds; struct ps_device *ps_dev; uint8_t max_output_report_size; - int ret; + int i, ret; + + static const struct ps_led_info player_leds_info[] = { + { LED_FUNCTION_PLAYER1, "white", dualsense_player_led_get_brightness, + dualsense_player_led_set_brightness }, + { LED_FUNCTION_PLAYER2, "white", dualsense_player_led_get_brightness, + dualsense_player_led_set_brightness }, + { LED_FUNCTION_PLAYER3, "white", dualsense_player_led_get_brightness, + dualsense_player_led_set_brightness }, + { LED_FUNCTION_PLAYER4, "white", dualsense_player_led_get_brightness, + dualsense_player_led_set_brightness }, + { LED_FUNCTION_PLAYER5, "white", dualsense_player_led_get_brightness, + dualsense_player_led_set_brightness } + }; ds = devm_kzalloc(&hdev->dev, sizeof(*ds), GFP_KERNEL); if (!ds) @@ -1297,6 +1372,14 @@ static struct ps_device *dualsense_create(struct hid_device *hdev) /* Set default lightbar color. */ dualsense_set_lightbar(ds, 0, 0, 128); /* blue */ + for (i = 0; i < ARRAY_SIZE(player_leds_info); i++) { + const struct ps_led_info *led_info = &player_leds_info[i]; + + ret = ps_led_register(ps_dev, &ds->player_leds[i], led_info); + if (ret < 0) + goto err; + } + ret = ps_device_set_player_id(ps_dev); if (ret) { hid_err(hdev, "Failed to assign player id for DualSense: %d\n", ret); From 62964653b74c78ca511bba427fa3cd7268b542a3 Mon Sep 17 00:00:00 2001 From: Jiri Kosina Date: Wed, 27 Oct 2021 10:04:10 +0200 Subject: [PATCH 10/51] UPSTREAM: HID: playstation: fix return from dualsense_player_led_set_brightness() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit brightness_set_blocking() callback expects function returning int. 
This fixes the follwoing build failure: drivers/hid/hid-playstation.c: In function ‘dualsense_player_led_set_brightness’: drivers/hid/hid-playstation.c:885:1: error: no return statement in function returning non-void [-Werror=return-type] } ^ Signed-off-by: Jiri Kosina Bug: 260685629 (cherry picked from commit 3c92cb4cb60c71b574e47108ead8b6f0470850db) Change-Id: Id16b960826a26ac22c1a14572444f9af29689ed6 Signed-off-by: Farid Chahla --- drivers/hid/hid-playstation.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index 5cdfa71d1563..b1b5721b5d8f 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -882,6 +882,8 @@ static int dualsense_player_led_set_brightness(struct led_classdev *led, enum le spin_unlock_irqrestore(&ds->base.lock, flags); schedule_work(&ds->output_worker); + + return 0; } static void dualsense_init_output_report(struct dualsense *ds, struct dualsense_output_report *rp, From a301358cb5eac73b8857d50ab977e46625fd29c8 Mon Sep 17 00:00:00 2001 From: Greg Kroah-Hartman Date: Thu, 4 Aug 2022 13:30:52 +0200 Subject: [PATCH 11/51] UPSTREAM: HID: playstation: convert to use dev_groups There is no need for a driver to individually add/create device groups, the driver core will do it automatically for you. Convert the hid-playstation driver to use the dev_groups pointer instead of manually calling the driver core to create the group and have it be cleaned up later on by the devm core. Cc: Roderick Colenbrander Cc: Jiri Kosina Cc: Benjamin Tissoires Cc: linux-input@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman Acked-by: Roderick Colenbrander Signed-off-by: Jiri Kosina Bug: 260685629 (cherry picked from commit b4a9af9be628e4f9d09997e0bdef30f6718e88ec) Change-Id: I516a1b0ef7f4f8545e0c1b9485b49879dd7a3136 Signed-off-by: Farid Chahla --- drivers/hid/hid-playstation.c | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index b1b5721b5d8f..40050eb85c0a 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -692,15 +692,12 @@ static ssize_t hardware_version_show(struct device *dev, static DEVICE_ATTR_RO(hardware_version); -static struct attribute *ps_device_attributes[] = { +static struct attribute *ps_device_attrs[] = { &dev_attr_firmware_version.attr, &dev_attr_hardware_version.attr, NULL }; - -static const struct attribute_group ps_device_attribute_group = { - .attrs = ps_device_attributes, -}; +ATTRIBUTE_GROUPS(ps_device); static int dualsense_get_calibration_data(struct dualsense *ds) { @@ -1448,12 +1445,6 @@ static int ps_probe(struct hid_device *hdev, const struct hid_device_id *id) } } - ret = devm_device_add_group(&hdev->dev, &ps_device_attribute_group); - if (ret) { - hid_err(hdev, "Failed to register sysfs nodes.\n"); - goto err_close; - } - return ret; err_close: @@ -1487,6 +1478,9 @@ static struct hid_driver ps_driver = { .probe = ps_probe, .remove = ps_remove, .raw_event = ps_raw_event, + .driver = { + .dev_groups = ps_device_groups, + }, }; static int __init ps_init(void) From a3ea8fbc1fa4bc77c45bed8e65fb052b76fab1b0 Mon Sep 17 00:00:00 2001 From: Roderick Colenbrander Date: Mon, 10 Oct 2022 14:23:11 -0700 Subject: [PATCH 12/51] UPSTREAM: HID: playstation: stop DualSense output work on remove. Ensure we don't schedule any new output work on removal and wait for any existing work to complete. If we don't do this e.g. 
rumble work can get queued during deletion and we trigger a kernel crash. Signed-off-by: Roderick Colenbrander CC: stable@vger.kernel.org Signed-off-by: Benjamin Tissoires Link: https://lore.kernel.org/r/20221010212313.78275-2-roderick.colenbrander@sony.com Bug: 260685629 (cherry picked from commit 182934a1e93b17f4edf71f4fcc8d19b19a6fe67a) Change-Id: I40cadfde5765cdabf45def929860258d6019bf10 Signed-off-by: Farid Chahla --- drivers/hid/hid-playstation.c | 41 ++++++++++++++++++++++++++++++----- 1 file changed, 36 insertions(+), 5 deletions(-) diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index 40050eb85c0a..d727cd2bf44e 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -46,6 +46,7 @@ struct ps_device { uint32_t fw_version; int (*parse_report)(struct ps_device *dev, struct hid_report *report, u8 *data, int size); + void (*remove)(struct ps_device *dev); }; /* Calibration data for playstation motion sensors. */ @@ -174,6 +175,7 @@ struct dualsense { struct led_classdev player_leds[5]; struct work_struct output_worker; + bool output_worker_initialized; void *output_report_dmabuf; uint8_t output_seq; /* Sequence number for output report. */ }; @@ -299,6 +301,7 @@ static const struct {int x; int y; } ps_gamepad_hat_mapping[] = { {0, 0}, }; +static inline void dualsense_schedule_work(struct dualsense *ds); static void dualsense_set_lightbar(struct dualsense *ds, uint8_t red, uint8_t green, uint8_t blue); /* @@ -789,6 +792,7 @@ err_free: return ret; } + static int dualsense_get_firmware_info(struct dualsense *ds) { uint8_t *buf; @@ -878,7 +882,7 @@ static int dualsense_player_led_set_brightness(struct led_classdev *led, enum le ds->update_player_leds = true; spin_unlock_irqrestore(&ds->base.lock, flags); - schedule_work(&ds->output_worker); + dualsense_schedule_work(ds); return 0; } @@ -922,6 +926,16 @@ static void dualsense_init_output_report(struct dualsense *ds, struct dualsense_ } } +static inline void dualsense_schedule_work(struct dualsense *ds) +{ + unsigned long flags; + + spin_lock_irqsave(&ds->base.lock, flags); + if (ds->output_worker_initialized) + schedule_work(&ds->output_worker); + spin_unlock_irqrestore(&ds->base.lock, flags); +} + /* * Helper function to send DualSense output reports. Applies a CRC at the end of a report * for Bluetooth reports. @@ -1082,7 +1096,7 @@ static int dualsense_parse_report(struct ps_device *ps_dev, struct hid_report *r spin_unlock_irqrestore(&ps_dev->lock, flags); /* Schedule updating of microphone state at hardware level. 
*/ - schedule_work(&ds->output_worker); + dualsense_schedule_work(ds); } ds->last_btn_mic_state = btn_mic_state; @@ -1197,10 +1211,22 @@ static int dualsense_play_effect(struct input_dev *dev, void *data, struct ff_ef ds->motor_right = effect->u.rumble.weak_magnitude / 256; spin_unlock_irqrestore(&ds->base.lock, flags); - schedule_work(&ds->output_worker); + dualsense_schedule_work(ds); return 0; } +static void dualsense_remove(struct ps_device *ps_dev) +{ + struct dualsense *ds = container_of(ps_dev, struct dualsense, base); + unsigned long flags; + + spin_lock_irqsave(&ds->base.lock, flags); + ds->output_worker_initialized = false; + spin_unlock_irqrestore(&ds->base.lock, flags); + + cancel_work_sync(&ds->output_worker); +} + static int dualsense_reset_leds(struct dualsense *ds) { struct dualsense_output_report report; @@ -1237,7 +1263,7 @@ static void dualsense_set_lightbar(struct dualsense *ds, uint8_t red, uint8_t gr ds->lightbar_blue = blue; spin_unlock_irqrestore(&ds->base.lock, flags); - schedule_work(&ds->output_worker); + dualsense_schedule_work(ds); } static void dualsense_set_player_leds(struct dualsense *ds) @@ -1260,7 +1286,7 @@ static void dualsense_set_player_leds(struct dualsense *ds) ds->update_player_leds = true; ds->player_leds_state = player_ids[player_id]; - schedule_work(&ds->output_worker); + dualsense_schedule_work(ds); } static struct ps_device *dualsense_create(struct hid_device *hdev) @@ -1299,7 +1325,9 @@ static struct ps_device *dualsense_create(struct hid_device *hdev) ps_dev->battery_capacity = 100; /* initial value until parse_report. */ ps_dev->battery_status = POWER_SUPPLY_STATUS_UNKNOWN; ps_dev->parse_report = dualsense_parse_report; + ps_dev->remove = dualsense_remove; INIT_WORK(&ds->output_worker, dualsense_output_worker); + ds->output_worker_initialized = true; hid_set_drvdata(hdev, ds); max_output_report_size = sizeof(struct dualsense_output_report_bt); @@ -1461,6 +1489,9 @@ static void ps_remove(struct hid_device *hdev) ps_devices_list_remove(dev); ps_device_release_player_id(dev); + if (dev->remove) + dev->remove(dev); + hid_hw_close(hdev); hid_hw_stop(hdev); } From 63b2567f9de8f1b3b2b33a6af647e571e72a96a1 Mon Sep 17 00:00:00 2001 From: Roderick Colenbrander Date: Mon, 10 Oct 2022 14:23:12 -0700 Subject: [PATCH 13/51] UPSTREAM: HID: playstation: add initial DualSense Edge controller support Provide initial support for the DualSense Edge controller. The brings support up to the level of the original DualSense, but won't yet provide support for new features (e.g. reprogrammable buttons). 
Signed-off-by: Roderick Colenbrander CC: stable@vger.kernel.org Signed-off-by: Benjamin Tissoires Link: https://lore.kernel.org/r/20221010212313.78275-3-roderick.colenbrander@sony.com Bug: 260685629 (cherry picked from commit b8a968efab301743fd659b5649c5d7d3e30e63a6) Change-Id: I5b95de806e823085d1144f016d8cfd76e4a933ef Signed-off-by: Farid Chahla --- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-playstation.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 07ee87f828e4..6be1658f7f8b 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -1087,6 +1087,7 @@ #define USB_DEVICE_ID_SONY_PS4_CONTROLLER_2 0x09cc #define USB_DEVICE_ID_SONY_PS4_CONTROLLER_DONGLE 0x0ba0 #define USB_DEVICE_ID_SONY_PS5_CONTROLLER 0x0ce6 +#define USB_DEVICE_ID_SONY_PS5_CONTROLLER_2 0x0df2 #define USB_DEVICE_ID_SONY_MOTION_CONTROLLER 0x03d5 #define USB_DEVICE_ID_SONY_NAVIGATION_CONTROLLER 0x042f #define USB_DEVICE_ID_SONY_BUZZ_CONTROLLER 0x0002 diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index d727cd2bf44e..396356b6760a 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -1464,7 +1464,8 @@ static int ps_probe(struct hid_device *hdev, const struct hid_device_id *id) goto err_stop; } - if (hdev->product == USB_DEVICE_ID_SONY_PS5_CONTROLLER) { + if (hdev->product == USB_DEVICE_ID_SONY_PS5_CONTROLLER || + hdev->product == USB_DEVICE_ID_SONY_PS5_CONTROLLER_2) { dev = dualsense_create(hdev); if (IS_ERR(dev)) { hid_err(hdev, "Failed to create dualsense.\n"); @@ -1499,6 +1500,8 @@ static void ps_remove(struct hid_device *hdev) static const struct hid_device_id ps_devices[] = { { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS5_CONTROLLER) }, { HID_USB_DEVICE(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS5_CONTROLLER) }, + { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS5_CONTROLLER_2) }, + { HID_USB_DEVICE(USB_VENDOR_ID_SONY, USB_DEVICE_ID_SONY_PS5_CONTROLLER_2) }, { } }; MODULE_DEVICE_TABLE(hid, ps_devices); From 4aa3cab588aa55d4e847755505d039c4ce2b056e Mon Sep 17 00:00:00 2001 From: Roderick Colenbrander Date: Mon, 10 Oct 2022 14:23:13 -0700 Subject: [PATCH 14/51] UPSTREAM: HID: playstation: support updated DualSense rumble mode. Newer DualSense firmware supports a revised classic rumble mode, which feels more similar to rumble as supported on previous PlayStation controllers. It has been made the default on PlayStation and non-PlayStation devices now (e.g. iOS and Windows). Default to this new mode when supported. Signed-off-by: Roderick Colenbrander Signed-off-by: Benjamin Tissoires Link: https://lore.kernel.org/r/20221010212313.78275-4-roderick.colenbrander@sony.com Bug: 260685629 (cherry picked from commit 9fecab247ed15e6145c126fc56ee1e89860741a7) Change-Id: Icd330111a4d1b1e76a04cd11c623d0982ce3d66f Signed-off-by: Farid Chahla --- drivers/hid/hid-playstation.c | 37 ++++++++++++++++++++++++++++++++++- 1 file changed, 36 insertions(+), 1 deletion(-) diff --git a/drivers/hid/hid-playstation.c b/drivers/hid/hid-playstation.c index 396356b6760a..0b58763bfd30 100644 --- a/drivers/hid/hid-playstation.c +++ b/drivers/hid/hid-playstation.c @@ -108,6 +108,9 @@ struct ps_led_info { #define DS_STATUS_CHARGING GENMASK(7, 4) #define DS_STATUS_CHARGING_SHIFT 4 +/* Feature version from DualSense Firmware Info report. */ +#define DS_FEATURE_VERSION(major, minor) ((major & 0xff) << 8 | (minor & 0xff)) + /* * Status of a DualSense touch point contact. 
* Contact IDs, with highest bit set are 'inactive' @@ -126,6 +129,7 @@ struct ps_led_info { #define DS_OUTPUT_VALID_FLAG1_RELEASE_LEDS BIT(3) #define DS_OUTPUT_VALID_FLAG1_PLAYER_INDICATOR_CONTROL_ENABLE BIT(4) #define DS_OUTPUT_VALID_FLAG2_LIGHTBAR_SETUP_CONTROL_ENABLE BIT(1) +#define DS_OUTPUT_VALID_FLAG2_COMPATIBLE_VIBRATION2 BIT(2) #define DS_OUTPUT_POWER_SAVE_CONTROL_MIC_MUTE BIT(4) #define DS_OUTPUT_LIGHTBAR_SETUP_LIGHT_OUT BIT(1) @@ -143,6 +147,9 @@ struct dualsense { struct input_dev *sensors; struct input_dev *touchpad; + /* Update version is used as a feature/capability version. */ + uint16_t update_version; + /* Calibration data for accelerometer and gyroscope. */ struct ps_calibration_data accel_calib_data[3]; struct ps_calibration_data gyro_calib_data[3]; @@ -153,6 +160,7 @@ struct dualsense { uint32_t sensor_timestamp_us; /* Compatible rumble state */ + bool use_vibration_v2; bool update_rumble; uint8_t motor_left; uint8_t motor_right; @@ -812,6 +820,15 @@ static int dualsense_get_firmware_info(struct dualsense *ds) ds->base.hw_version = get_unaligned_le32(&buf[24]); ds->base.fw_version = get_unaligned_le32(&buf[28]); + /* Update version is some kind of feature version. It is distinct from + * the firmware version as there can be many different variations of a + * controller over time with the same physical shell, but with different + * PCBs and other internal changes. The update version (internal name) is + * used as a means to detect what features are available and change behavior. + * Note: the version is different between DualSense and DualSense Edge. + */ + ds->update_version = get_unaligned_le16(&buf[44]); + err_free: kfree(buf); return ret; @@ -974,7 +991,10 @@ static void dualsense_output_worker(struct work_struct *work) if (ds->update_rumble) { /* Select classic rumble style haptics and enable it. */ common->valid_flag0 |= DS_OUTPUT_VALID_FLAG0_HAPTICS_SELECT; - common->valid_flag0 |= DS_OUTPUT_VALID_FLAG0_COMPATIBLE_VIBRATION; + if (ds->use_vibration_v2) + common->valid_flag2 |= DS_OUTPUT_VALID_FLAG2_COMPATIBLE_VIBRATION2; + else + common->valid_flag0 |= DS_OUTPUT_VALID_FLAG0_COMPATIBLE_VIBRATION; common->motor_left = ds->motor_left; common->motor_right = ds->motor_right; ds->update_rumble = false; @@ -1348,6 +1368,21 @@ static struct ps_device *dualsense_create(struct hid_device *hdev) return ERR_PTR(ret); } + /* Original DualSense firmware simulated classic controller rumble through + * its new haptics hardware. It felt different from classic rumble users + * were used to. Since then new firmwares were introduced to change behavior + * and make this new 'v2' behavior default on PlayStation and other platforms. + * The original DualSense requires a new enough firmware as bundled with PS5 + * software released in 2021. DualSense edge supports it out of the box. + * Both devices also support the old mode, but it is not really used. + */ + if (hdev->product == USB_DEVICE_ID_SONY_PS5_CONTROLLER) { + /* Feature version 2.21 introduced new vibration method. */ + ds->use_vibration_v2 = ds->update_version >= DS_FEATURE_VERSION(2, 21); + } else if (hdev->product == USB_DEVICE_ID_SONY_PS5_CONTROLLER_2) { + ds->use_vibration_v2 = true; + } + ret = ps_devices_list_add(ps_dev); if (ret) return ERR_PTR(ret); From 16c03440df4a1fc175b781ed7328d1af290b54e9 Mon Sep 17 00:00:00 2001 From: Farid Chahla Date: Wed, 14 Dec 2022 12:44:43 -0800 Subject: [PATCH 15/51] ANDROID: GKI: enable mulitcolor-led To enable newer version of DualSense driver, i.e. 
hid-playstation, we need to set LEDS_CLASS_MULTICOLOR to "y". Bug: 260685629 Change-Id: I52b0b1b6a061457e009b62a6bd6b66a91c8c37a2 Signed-off-by: Farid Chahla --- arch/arm64/configs/gki_defconfig | 1 + arch/x86/configs/gki_defconfig | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/arm64/configs/gki_defconfig b/arch/arm64/configs/gki_defconfig index da8ab23b9ce5..33b4ea90b6dd 100644 --- a/arch/arm64/configs/gki_defconfig +++ b/arch/arm64/configs/gki_defconfig @@ -513,6 +513,7 @@ CONFIG_MMC_CRYPTO=y CONFIG_MMC_SDHCI=y CONFIG_MMC_SDHCI_PLTFM=y CONFIG_LEDS_CLASS_FLASH=y +CONFIG_LEDS_CLASS_MULTICOLOR=y CONFIG_LEDS_TRIGGER_TIMER=y CONFIG_LEDS_TRIGGER_TRANSIENT=y CONFIG_EDAC=y diff --git a/arch/x86/configs/gki_defconfig b/arch/x86/configs/gki_defconfig index 14ea69c0d417..47edba3df354 100644 --- a/arch/x86/configs/gki_defconfig +++ b/arch/x86/configs/gki_defconfig @@ -464,6 +464,7 @@ CONFIG_MMC_CRYPTO=y CONFIG_MMC_SDHCI=y CONFIG_MMC_SDHCI_PLTFM=y CONFIG_LEDS_CLASS_FLASH=y +CONFIG_LEDS_CLASS_MULTICOLOR=y CONFIG_LEDS_TRIGGER_TIMER=y CONFIG_LEDS_TRIGGER_TRANSIENT=y CONFIG_EDAC=y From 134c1aae4311b6fb9823722647ab9c4ab554a152 Mon Sep 17 00:00:00 2001 From: Kalesh Singh Date: Mon, 19 Dec 2022 21:07:49 -0800 Subject: [PATCH 16/51] ANDROID: Make SPF aware of fast mremaps SPF attempts page faults without taking the mmap lock, but takes the PTL. If there is a concurrent fast mremap (at PMD/PUD level), this can lead to a UAF as fast mremap will only take the PTL locks at the PMD/PUD level. SPF cannot take the PTL locks at the larger subtree granularity since this introduces much contention in the page fault paths. To address the race: 1) Fast mremaps wait until there are no users of the VMA. 2) Speculative faults detect ongoing fast mremaps and fallback to conventional fault handling (taking mmap read lock). Since this race condition is very rare the performance impact is negligible. Bug: 263177905 Change-Id: If9755aa4261337fe180e3093a3cefaae8ac9ff1a Signed-off-by: Kalesh Singh --- include/linux/mm.h | 3 ++ mm/mmap.c | 30 ++++++++++++-- mm/mremap.c | 100 ++++++++++++++++++++++++++++++++++++++------- 3 files changed, 116 insertions(+), 17 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index dfefcfa1d6a4..5de4309bfa14 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1758,6 +1758,9 @@ int generic_access_phys(struct vm_area_struct *vma, unsigned long addr, void *buf, int len, int write); #ifdef CONFIG_SPECULATIVE_PAGE_FAULT +extern wait_queue_head_t vma_users_wait; +extern atomic_t vma_user_waiters; + static inline void vm_write_begin(struct vm_area_struct *vma) { /* diff --git a/mm/mmap.c b/mm/mmap.c index fba57c628671..c3dfbfdb674a 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -180,7 +180,17 @@ static void __free_vma(struct vm_area_struct *vma) #ifdef CONFIG_SPECULATIVE_PAGE_FAULT void put_vma(struct vm_area_struct *vma) { - if (atomic_dec_and_test(&vma->vm_ref_count)) + int ref_count = atomic_dec_return(&vma->vm_ref_count); + + /* + * Implicit smp_mb due to atomic_dec_return. + * + * If this is the last reference, wake up the mremap waiter + * (if any). 
+ */ + if (ref_count == 1 && unlikely(atomic_read(&vma_user_waiters) > 0)) + wake_up(&vma_users_wait); + else if (ref_count <= 0) __free_vma(vma); } #else @@ -2421,8 +2431,22 @@ struct vm_area_struct *get_vma(struct mm_struct *mm, unsigned long addr) read_lock(&mm->mm_rb_lock); vma = __find_vma(mm, addr); - if (vma) - atomic_inc(&vma->vm_ref_count); + + /* + * If there is a concurrent fast mremap, bail out since the entire + * PMD/PUD subtree may have been remapped. + * + * This is usually safe for conventional mremap since it takes the + * PTE locks as does SPF. However fast mremap only takes the lock + * at the PMD/PUD level which is ok as it is done with the mmap + * write lock held. But since SPF, as the term implies forgoes, + * taking the mmap read lock and also cannot take PTL lock at the + * larger PMD/PUD granualrity, since it would introduce huge + * contention in the page fault path; fall back to regular fault + * handling. + */ + if (vma && !atomic_inc_unless_negative(&vma->vm_ref_count)) + vma = NULL; read_unlock(&mm->mm_rb_lock); return vma; diff --git a/mm/mremap.c b/mm/mremap.c index 5a18cec23fa7..0763b83ef779 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -210,17 +210,74 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd, drop_rmap_locks(vma); } +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT +DECLARE_WAIT_QUEUE_HEAD(vma_users_wait); +atomic_t vma_user_waiters = ATOMIC_INIT(0); + +static inline void wait_for_vma_users(struct vm_area_struct *vma) +{ + /* + * If we have the only reference, swap the refcount to -1. This + * will prevent other concurrent references by get_vma() for SPFs. + */ + if (likely(atomic_cmpxchg(&vma->vm_ref_count, 1, -1) == 1)) + return; + + /* Indicate we are waiting for other users of the VMA to finish. */ + atomic_inc(&vma_user_waiters); + + /* Failed atomic_cmpxchg; no implicit barrier, use an explicit one. */ + smp_mb(); + + /* + * Callers cannot handle failure, sleep uninterruptibly until there + * are no other users of this VMA. + * + * We don't need to worry about references from concurrent waiters, + * since this is only used in the context of fast mremaps, with + * exclusive mmap write lock held. + */ + wait_event(vma_users_wait, atomic_cmpxchg(&vma->vm_ref_count, 1, -1) == 1); + + atomic_dec(&vma_user_waiters); +} + + /* - * Speculative page fault handlers will not detect page table changes done - * without ptl locking. + * Restore the VMA reference count to 1 after a fast mremap. */ -#if defined(CONFIG_HAVE_MOVE_PMD) && !defined(CONFIG_SPECULATIVE_PAGE_FAULT) +static inline void restore_vma_ref_count(struct vm_area_struct *vma) +{ + /* + * This should only be called after a corresponding, + * wait_for_vma_users() + */ + VM_BUG_ON_VMA(atomic_cmpxchg(&vma->vm_ref_count, -1, 1) != -1, + vma); +} +#else /* !CONFIG_SPECULATIVE_PAGE_FAULT */ +static inline void wait_for_vma_users(struct vm_area_struct *vma) +{ +} +static inline void restore_vma_ref_count(struct vm_area_struct *vma) +{ +} +#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */ + +#ifdef CONFIG_HAVE_MOVE_PMD static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd) { spinlock_t *old_ptl, *new_ptl; struct mm_struct *mm = vma->vm_mm; pmd_t pmd; + bool ret; + + /* + * Wait for concurrent users, since these can potentially be + * speculative page faults. 
+ */ + wait_for_vma_users(vma); /* * The destination pmd shouldn't be established, free_pgtables() @@ -245,8 +302,10 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, * One alternative might be to just unmap the target pmd at * this point, and verify that it really is empty. We'll see. */ - if (WARN_ON_ONCE(!pmd_none(*new_pmd))) - return false; + if (WARN_ON_ONCE(!pmd_none(*new_pmd))) { + ret = false; + goto out; + } /* * We don't have to worry about the ordering of src and dst @@ -270,7 +329,11 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, spin_unlock(new_ptl); spin_unlock(old_ptl); - return true; + ret = true; + +out: + restore_vma_ref_count(vma); + return ret; } #else static inline bool move_normal_pmd(struct vm_area_struct *vma, @@ -281,24 +344,29 @@ static inline bool move_normal_pmd(struct vm_area_struct *vma, } #endif -/* - * Speculative page fault handlers will not detect page table changes done - * without ptl locking. - */ -#if defined(CONFIG_HAVE_MOVE_PUD) && !defined(CONFIG_SPECULATIVE_PAGE_FAULT) +#ifdef CONFIG_HAVE_MOVE_PUD static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr, unsigned long new_addr, pud_t *old_pud, pud_t *new_pud) { spinlock_t *old_ptl, *new_ptl; struct mm_struct *mm = vma->vm_mm; pud_t pud; + bool ret; + + /* + * Wait for concurrent users, since these can potentially be + * speculative page faults. + */ + wait_for_vma_users(vma); /* * The destination pud shouldn't be established, free_pgtables() * should have released it. */ - if (WARN_ON_ONCE(!pud_none(*new_pud))) - return false; + if (WARN_ON_ONCE(!pud_none(*new_pud))) { + ret = false; + goto out; + } /* * We don't have to worry about the ordering of src and dst @@ -322,7 +390,11 @@ static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr, spin_unlock(new_ptl); spin_unlock(old_ptl); - return true; + ret = true; + +out: + restore_vma_ref_count(vma); + return ret; } #else static inline bool move_normal_pud(struct vm_area_struct *vma, From 05a8f2c4d2f5b0664b97f3c426783cc26a946b3b Mon Sep 17 00:00:00 2001 From: Pradeep P V K Date: Mon, 6 Dec 2021 14:16:45 +0530 Subject: [PATCH 17/51] FROMLIST: fuse: give wakeup hints to the scheduler The synchronous wakeup interface is available only for the interruptible wakeup. Add it for normal wakeup and use this synchronous wakeup interface to wakeup the userspace daemon. Scheduler can make use of this hint to find a better CPU for the waker task. With this change the performance numbers for compress, decompress and copy use-cases on /sdcard path has improved by ~30%. Use-case details: 1. copy 10000 files of each 4k size into /sdcard path 2. use any File explorer application that has compress/decompress support 3. start compress/decompress and capture the time. 
------------------------------------------------- | Default | wakeup support | Improvement/Diff | ------------------------------------------------- | 13.8 sec | 9.9 sec | 3.9 sec (28.26%) | ------------------------------------------------- Co-developed-by: Pavankumar Kondeti Signed-off-by: Pradeep P V K Bug: 216261533 Link: https://lore.kernel.org/lkml/1638780405-38026-1-git-send-email-quic_pragalla@quicinc.com/ Change-Id: I9ac89064e34b1e0605064bf4d2d3a310679cb605 Signed-off-by: Pradeep P V K Signed-off-by: Alessio Balsini (cherry picked from commit 30d72758dbe0e7fa9992f5d21ee8d23eec27934a) --- fs/fuse/dev.c | 21 ++++++++++++--------- fs/fuse/fuse_i.h | 6 +++--- fs/fuse/virtio_fs.c | 8 +++++--- include/linux/wait.h | 1 + 4 files changed, 21 insertions(+), 15 deletions(-) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index e9a1543ba7cb..47eef2496230 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -208,10 +208,13 @@ static unsigned int fuse_req_hash(u64 unique) /** * A new request is available, wake fiq->waitq */ -static void fuse_dev_wake_and_unlock(struct fuse_iqueue *fiq) +static void fuse_dev_wake_and_unlock(struct fuse_iqueue *fiq, bool sync) __releases(fiq->lock) { - wake_up(&fiq->waitq); + if (sync) + wake_up_sync(&fiq->waitq); + else + wake_up(&fiq->waitq); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); spin_unlock(&fiq->lock); } @@ -224,14 +227,14 @@ const struct fuse_iqueue_ops fuse_dev_fiq_ops = { EXPORT_SYMBOL_GPL(fuse_dev_fiq_ops); static void queue_request_and_unlock(struct fuse_iqueue *fiq, - struct fuse_req *req) + struct fuse_req *req, bool sync) __releases(fiq->lock) { req->in.h.len = sizeof(struct fuse_in_header) + fuse_len_args(req->args->in_numargs, (struct fuse_arg *) req->args->in_args); list_add_tail(&req->list, &fiq->pending); - fiq->ops->wake_pending_and_unlock(fiq); + fiq->ops->wake_pending_and_unlock(fiq, sync); } void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget, @@ -246,7 +249,7 @@ void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget, if (fiq->connected) { fiq->forget_list_tail->next = forget; fiq->forget_list_tail = forget; - fiq->ops->wake_forget_and_unlock(fiq); + fiq->ops->wake_forget_and_unlock(fiq, false); } else { kfree(forget); spin_unlock(&fiq->lock); @@ -266,7 +269,7 @@ static void flush_bg_queue(struct fuse_conn *fc) fc->active_background++; spin_lock(&fiq->lock); req->in.h.unique = fuse_get_unique(fiq); - queue_request_and_unlock(fiq, req); + queue_request_and_unlock(fiq, req, false); } } @@ -359,7 +362,7 @@ static int queue_interrupt(struct fuse_req *req) spin_unlock(&fiq->lock); return 0; } - fiq->ops->wake_interrupt_and_unlock(fiq); + fiq->ops->wake_interrupt_and_unlock(fiq, false); } else { spin_unlock(&fiq->lock); } @@ -426,7 +429,7 @@ static void __fuse_request_send(struct fuse_req *req) /* acquire extra reference, since request is still needed after fuse_request_end() */ __fuse_get_request(req); - queue_request_and_unlock(fiq, req); + queue_request_and_unlock(fiq, req, true); request_wait_answer(req); /* Pairs with smp_wmb() in fuse_request_end() */ @@ -601,7 +604,7 @@ static int fuse_simple_notify_reply(struct fuse_mount *fm, spin_lock(&fiq->lock); if (fiq->connected) { - queue_request_and_unlock(fiq, req); + queue_request_and_unlock(fiq, req, false); } else { err = -ENODEV; spin_unlock(&fiq->lock); diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 69a631c12b15..e20c341864b5 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -412,19 +412,19 @@ struct fuse_iqueue_ops { /** * Signal that a 
forget has been queued */ - void (*wake_forget_and_unlock)(struct fuse_iqueue *fiq) + void (*wake_forget_and_unlock)(struct fuse_iqueue *fiq, bool sync) __releases(fiq->lock); /** * Signal that an INTERRUPT request has been queued */ - void (*wake_interrupt_and_unlock)(struct fuse_iqueue *fiq) + void (*wake_interrupt_and_unlock)(struct fuse_iqueue *fiq, bool sync) __releases(fiq->lock); /** * Signal that a request has been queued */ - void (*wake_pending_and_unlock)(struct fuse_iqueue *fiq) + void (*wake_pending_and_unlock)(struct fuse_iqueue *fiq, bool sync) __releases(fiq->lock); /** diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index b9cfb1165ff4..90a574ba92cf 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -971,7 +971,7 @@ static struct virtio_driver virtio_fs_driver = { #endif }; -static void virtio_fs_wake_forget_and_unlock(struct fuse_iqueue *fiq) +static void virtio_fs_wake_forget_and_unlock(struct fuse_iqueue *fiq, bool sync) __releases(fiq->lock) { struct fuse_forget_link *link; @@ -1006,7 +1006,8 @@ __releases(fiq->lock) kfree(link); } -static void virtio_fs_wake_interrupt_and_unlock(struct fuse_iqueue *fiq) +static void virtio_fs_wake_interrupt_and_unlock(struct fuse_iqueue *fiq, + bool sync) __releases(fiq->lock) { /* @@ -1221,7 +1222,8 @@ out: return ret; } -static void virtio_fs_wake_pending_and_unlock(struct fuse_iqueue *fiq) +static void virtio_fs_wake_pending_and_unlock(struct fuse_iqueue *fiq, + bool sync) __releases(fiq->lock) { unsigned int queue_id = VQ_REQUEST; /* TODO multiqueue */ diff --git a/include/linux/wait.h b/include/linux/wait.h index 1663e47681a3..e9966f3929f6 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -219,6 +219,7 @@ void __wake_up_pollfree(struct wait_queue_head *wq_head); #define wake_up_interruptible_nr(x, nr) __wake_up(x, TASK_INTERRUPTIBLE, nr, NULL) #define wake_up_interruptible_all(x) __wake_up(x, TASK_INTERRUPTIBLE, 0, NULL) #define wake_up_interruptible_sync(x) __wake_up_sync((x), TASK_INTERRUPTIBLE) +#define wake_up_sync(x) __wake_up_sync((x), TASK_NORMAL) /* * Wakeup macros to be used to report events to the targets. From 7bc2b8c400c0abae519fa73f0bed2fe8e844322e Mon Sep 17 00:00:00 2001 From: Mayank Rana Date: Wed, 18 May 2022 11:12:52 -0700 Subject: [PATCH 18/51] UPSTREAM: usb: dwc3: core: Add error log when core soft reset failed DWC3 controller soft reset is important operation for USB functionality. In case when it fails, currently there is no failure log. Hence add error log when core soft reset failed. 
Signed-off-by: Mayank Rana Signed-off-by: Greg Kroah-Hartman Bug: 235863377 (cherry picked from commit 859bdc359567f5fa8e8dc780d7b5e53ea43d9ce9) Change-Id: I60500f66af47d93cf9d60bdecab32e6dc48d4b7c Signed-off-by: Mayank Rana (cherry picked from commit d03bf01b43ff923e066938895a6867338778be7a) --- drivers/usb/dwc3/core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index 83a850038c51..40d358ca8b58 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -301,6 +301,7 @@ int dwc3_core_soft_reset(struct dwc3 *dwc) udelay(1); } while (--retries); + dev_warn(dwc->dev, "DWC3 controller soft reset failed.\n"); return -ETIMEDOUT; done: From 9e6fb5ac724218ac74f83762a4cc2e34e1158b21 Mon Sep 17 00:00:00 2001 From: Kever Yang Date: Sun, 18 Dec 2022 18:43:21 +0800 Subject: [PATCH 19/51] ANDROID: GKI: rockchip: Add symbol clk_hw_set_parent Leaf changes summary: 1 artifact changed Changed leaf types summary: 0 leaf type changed Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable 1 Added function: [A] 'function int clk_hw_set_parent(clk_hw*, clk_hw*)' Bug: 239396464 Signed-off-by: Kever Yang Change-Id: Id24cdc739a6254f7676a1f60fd8ecbd0066ca4b0 --- android/abi_gki_aarch64.xml | 6 ++++++ android/abi_gki_aarch64_rockchip | 3 ++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/android/abi_gki_aarch64.xml b/android/abi_gki_aarch64.xml index ba1c34e4205c..15fe4032346b 100644 --- a/android/abi_gki_aarch64.xml +++ b/android/abi_gki_aarch64.xml @@ -1173,6 +1173,7 @@ + @@ -122879,6 +122880,11 @@ + + + + + diff --git a/android/abi_gki_aarch64_rockchip b/android/abi_gki_aarch64_rockchip index c1df5fc9548f..6489a344910d 100644 --- a/android/abi_gki_aarch64_rockchip +++ b/android/abi_gki_aarch64_rockchip @@ -98,6 +98,7 @@ clk_hw_get_flags clk_hw_get_name clk_hw_get_parent + clk_hw_get_parent_by_index clk_hw_get_rate __clk_mux_determine_rate clk_notifier_register @@ -1596,7 +1597,6 @@ # required by clk-rockchip-regmap.ko clk_hw_get_num_parents - clk_hw_get_parent_by_index divider_recalc_rate divider_round_rate_parent @@ -1608,6 +1608,7 @@ __clk_get_hw clk_hw_register_composite clk_hw_round_rate + clk_hw_set_parent clk_mux_ops clk_mux_ro_ops clk_register_divider_table From a43cd1f2bb8e2a4b6b1943faccf05198afbbd9e9 Mon Sep 17 00:00:00 2001 From: Suren Baghdasaryan Date: Thu, 8 Dec 2022 17:26:26 +0000 Subject: [PATCH 20/51] Revert "Revert "ANDROID: vendor_hooks:vendor hook for __alloc_pages_slowpath."" This reverts commit cc51dcbc60c4492d68e3b075ff4d8bd61729dae4. Reason for revert: The vendor hooks were reverted but they are needed. 
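For reference, a vendor module attaches to these hooks through the regular
vendor-hook tracepoint API. A minimal sketch of such a consumer (the handler
and init function names below are made up for illustration; they are not part
of this change):

  #include <linux/module.h>
  #include <linux/mm.h>
  #include <trace/hooks/mm.h>

  /* The probe receives a void *data cookie first, then the TP_PROTO arguments. */
  static void vendor_reclaim_bypass(void *data, gfp_t gfp_mask, int order,
                                    int alloc_flags, int migratetype,
                                    struct page **page)
  {
          /*
           * A vendor-managed pool may satisfy the allocation here by setting
           * *page. Leaving *page == NULL lets __alloc_pages_slowpath() fall
           * through to direct reclaim exactly as before.
           */
  }

  static int __init vendor_mempool_init(void)
  {
          return register_trace_android_vh_alloc_pages_reclaim_bypass(
                          vendor_reclaim_bypass, NULL);
  }
  module_init(vendor_mempool_init);

The android_vh_alloc_pages_failure_bypass hook is registered the same way and
only fires after the allocation has otherwise failed.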
Bug: 243629905 Signed-off-by: xiaofeng Signed-off-by: Suren Baghdasaryan Change-Id: I4b2eab1a9bf3bbbb200f9d09f2c57fb4d9f2c143 --- drivers/android/vendor_hooks.c | 2 ++ include/trace/hooks/mm.h | 8 ++++++++ mm/page_alloc.c | 11 +++++++++++ 3 files changed, 21 insertions(+) diff --git a/drivers/android/vendor_hooks.c b/drivers/android/vendor_hooks.c index 91288f59a390..ebcc005090c1 100644 --- a/drivers/android/vendor_hooks.c +++ b/drivers/android/vendor_hooks.c @@ -465,6 +465,8 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_alloc_si); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_free_pages); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_set_shmem_page_flag); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_sched_pelt_multiplier); +EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_alloc_pages_reclaim_bypass); +EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_alloc_pages_failure_bypass); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_check_page_look_around_ref); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_look_around); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_look_around_migrate_page); diff --git a/include/trace/hooks/mm.h b/include/trace/hooks/mm.h index 3f32c876441f..a4b855e0a81a 100644 --- a/include/trace/hooks/mm.h +++ b/include/trace/hooks/mm.h @@ -287,6 +287,14 @@ DECLARE_HOOK(android_vh_set_shmem_page_flag, DECLARE_HOOK(android_vh_remove_vmalloc_stack, TP_PROTO(struct vm_struct *vm), TP_ARGS(vm)); +DECLARE_HOOK(android_vh_alloc_pages_reclaim_bypass, + TP_PROTO(gfp_t gfp_mask, int order, int alloc_flags, + int migratetype, struct page **page), + TP_ARGS(gfp_mask, order, alloc_flags, migratetype, page)); +DECLARE_HOOK(android_vh_alloc_pages_failure_bypass, + TP_PROTO(gfp_t gfp_mask, int order, int alloc_flags, + int migratetype, struct page **page), + TP_ARGS(gfp_mask, order, alloc_flags, migratetype, page)); DECLARE_HOOK(android_vh_test_clear_look_around_ref, TP_PROTO(struct page *page), TP_ARGS(page)); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 337fba577a09..18a969923841 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4956,6 +4956,12 @@ retry: if (current->flags & PF_MEMALLOC) goto nopage; + trace_android_vh_alloc_pages_reclaim_bypass(gfp_mask, order, + alloc_flags, ac->migratetype, &page); + + if (page) + goto got_pg; + /* Try direct reclaim and then allocating */ page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac, &did_some_progress); @@ -5071,6 +5077,11 @@ nopage: goto retry; } fail: + trace_android_vh_alloc_pages_failure_bypass(gfp_mask, order, + alloc_flags, ac->migratetype, &page); + if (page) + goto got_pg; + warn_alloc(gfp_mask, ac->nodemask, "page allocation failure: order:%u", order); got_pg: From f677efbea129421d02bbba6b5856985f451d08ed Mon Sep 17 00:00:00 2001 From: Suren Baghdasaryan Date: Wed, 14 Dec 2022 22:08:04 +0000 Subject: [PATCH 21/51] Revert "Revert "ANDROID: vendor_hooks:vendor hook for mmput"" This reverts commit 501063ce66dd386c16c47c463c8a3df0e810435f. 
Reason for revert: The vendor hook is actually needed by a partner Bug: 238821038 Signed-off-by: Suren Baghdasaryan Change-Id: I1c19add348792967975369a10ec9cb41fa268236 --- drivers/android/vendor_hooks.c | 1 + include/trace/hooks/sched.h | 4 ++++ kernel/fork.c | 4 +++- 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/android/vendor_hooks.c b/drivers/android/vendor_hooks.c index ebcc005090c1..3ff54d2aa04d 100644 --- a/drivers/android/vendor_hooks.c +++ b/drivers/android/vendor_hooks.c @@ -464,6 +464,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(android_rvh_alloc_si); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_alloc_si); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_free_pages); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_set_shmem_page_flag); +EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_mmput); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_sched_pelt_multiplier); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_alloc_pages_reclaim_bypass); EXPORT_TRACEPOINT_SYMBOL_GPL(android_vh_alloc_pages_failure_bypass); diff --git a/include/trace/hooks/sched.h b/include/trace/hooks/sched.h index 6488dee32a88..5f63128bb5c6 100644 --- a/include/trace/hooks/sched.h +++ b/include/trace/hooks/sched.h @@ -391,6 +391,10 @@ DECLARE_HOOK(android_vh_setscheduler_uclamp, TP_PROTO(struct task_struct *tsk, int clamp_id, unsigned int value), TP_ARGS(tsk, clamp_id, value)); +DECLARE_HOOK(android_vh_mmput, + TP_PROTO(void *unused), + TP_ARGS(unused)); + DECLARE_HOOK(android_vh_sched_pelt_multiplier, TP_PROTO(unsigned int old, unsigned int cur, int *ret), TP_ARGS(old, cur, ret)); diff --git a/kernel/fork.c b/kernel/fork.c index b2ab89589a7d..d515aa5b7eb5 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1150,8 +1150,10 @@ void mmput(struct mm_struct *mm) { might_sleep(); - if (atomic_dec_and_test(&mm->mm_users)) + if (atomic_dec_and_test(&mm->mm_users)) { + trace_android_vh_mmput(NULL); __mmput(mm); + } } EXPORT_SYMBOL_GPL(mmput); From ad1f2eebadfe1d409f9bdf17d3d5e53a9dd724c7 Mon Sep 17 00:00:00 2001 From: xiaofeng Date: Wed, 14 Dec 2022 16:57:31 +0800 Subject: [PATCH 22/51] ANDROID: GKI: update xiaomi symbol list Leaf changes summary: 6 artifacts changed Changed leaf types summary: 0 leaf type changed Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 3 Added functions Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 3 Added variables 3 Added functions: [A] 'function int __traceiter_android_vh_alloc_pages_failure_bypass(void*, gfp_t, int, int, int, page**)' [A] 'function int __traceiter_android_vh_alloc_pages_reclaim_bypass(void*, gfp_t, int, int, int, page**)' [A] 'function int __traceiter_android_vh_mmput(void*, mm_struct*)' 3 Added variables: [A] 'tracepoint __tracepoint_android_vh_alloc_pages_failure_bypass' [A] 'tracepoint __tracepoint_android_vh_alloc_pages_reclaim_bypass' [A] 'tracepoint __tracepoint_android_vh_mmput' Bug:262486564 Change-Id: I7c78cc3c5fde8a15c8a30073fcb1cb01708d9d37 Signed-off-by: xiaofeng --- android/abi_gki_aarch64.xml | 322 ++++++++++++++++++--------------- android/abi_gki_aarch64_xiaomi | 8 + 2 files changed, 185 insertions(+), 145 deletions(-) diff --git a/android/abi_gki_aarch64.xml b/android/abi_gki_aarch64.xml index 15fe4032346b..5477e84597cf 100644 --- a/android/abi_gki_aarch64.xml +++ b/android/abi_gki_aarch64.xml @@ -406,6 +406,8 @@ + + @@ -544,6 +546,7 @@ + @@ -6369,6 +6372,8 @@ + + @@ -6512,6 +6517,7 @@ + @@ -38009,63 +38015,63 @@ - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + - + @@ -53461,13 +53467,13 @@ - + - + - + @@ -60736,24 +60742,24 @@ - + - 
+ - + - + - + - + - + @@ -101122,21 +101128,21 @@ - + - + - + - + - + - + @@ -115571,9 +115577,9 @@ - + - + @@ -115602,11 +115608,11 @@ - - - - - + + + + + @@ -116242,9 +116248,9 @@ - - - + + + @@ -116297,9 +116303,9 @@ - - - + + + @@ -116810,9 +116816,9 @@ - - - + + + @@ -118005,6 +118011,24 @@ + + + + + + + + + + + + + + + + + + @@ -118789,18 +118813,18 @@ - - - - - - + + + + + + - - - - + + + + @@ -118904,6 +118928,11 @@ + + + + + @@ -119137,11 +119166,11 @@ - - - - - + + + + + @@ -119368,9 +119397,9 @@ - - - + + + @@ -120096,6 +120125,8 @@ + + @@ -120220,8 +120251,8 @@ - - + + @@ -120239,6 +120270,7 @@ + @@ -120288,7 +120320,7 @@ - + @@ -120330,7 +120362,7 @@ - + @@ -120939,9 +120971,9 @@ - - - + + + @@ -121065,9 +121097,9 @@ - - - + + + @@ -122880,9 +122912,9 @@ - - - + + + @@ -126734,22 +126766,22 @@ - - + + - - - + + + - - + + - - - + + + @@ -130091,9 +130123,9 @@ - - - + + + @@ -130145,9 +130177,9 @@ - - - + + + @@ -130266,14 +130298,14 @@ - - - + + + - - - + + + @@ -131149,12 +131181,12 @@ - - + + - - + + @@ -131173,12 +131205,12 @@ - - - - - - + + + + + + @@ -131221,8 +131253,8 @@ - - + + @@ -137059,14 +137091,14 @@ - - - - + + + + - - + + @@ -142086,11 +142118,11 @@ - + - - + + @@ -145708,21 +145740,21 @@ - - - + + + - - + + - - + + - - + + @@ -145732,13 +145764,13 @@ - - - + + + - - + + @@ -148372,9 +148404,9 @@ - - - + + + @@ -148386,8 +148418,8 @@ - - + + diff --git a/android/abi_gki_aarch64_xiaomi b/android/abi_gki_aarch64_xiaomi index a162683958cd..ebbbb0e81420 100644 --- a/android/abi_gki_aarch64_xiaomi +++ b/android/abi_gki_aarch64_xiaomi @@ -200,3 +200,11 @@ wakeup_sources_read_unlock wakeup_sources_walk_start wakeup_sources_walk_next + +#required by mi_mempool.ko module + __traceiter_android_vh_mmput + __tracepoint_android_vh_mmput + __traceiter_android_vh_alloc_pages_reclaim_bypass + __tracepoint_android_vh_alloc_pages_reclaim_bypass + __traceiter_android_vh_alloc_pages_failure_bypass + __tracepoint_android_vh_alloc_pages_failure_bypass From 6d015667ce74bf8a3cd2400ad6516f24c6a827d6 Mon Sep 17 00:00:00 2001 From: Bing-Jhong Billy Jheng Date: Thu, 15 Dec 2022 06:43:56 -0800 Subject: [PATCH 23/51] UPSTREAM: io_uring: add missing item types for splice request Splice is like read/write and should grab current->nsproxy, denoted by IO_WQ_WORK_FILES as it refers to current->files as well Change-Id: I94a99fdef5764e7eda5da778b5b52a150b9fe5eb Signed-off-by: Bing-Jhong Billy Jheng Reviewed-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman (cherry picked from commit 75454b4bbfc7e6a4dd8338556f36ea9107ddf61a) Signed-off-by: Greg Kroah-Hartman --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index a952288b2ab8..5538906e47fe 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -935,7 +935,7 @@ static const struct io_op_def io_op_defs[] = { .needs_file = 1, .hash_reg_file = 1, .unbound_nonreg_file = 1, - .work_flags = IO_WQ_WORK_BLKCG, + .work_flags = IO_WQ_WORK_BLKCG | IO_WQ_WORK_FILES, }, [IORING_OP_PROVIDE_BUFFERS] = {}, [IORING_OP_REMOVE_BUFFERS] = {}, From fe60669d0308bfdd1423976ac24b54c2c386f736 Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Thu, 15 Dec 2022 21:53:37 +0000 Subject: [PATCH 24/51] ANDROID: fips140: add dump_jitterentropy command to fips140_lab_util For the entropy analysis, we must provide some output from the Jitter RNG: a large amount of output from one instance, and a smaller amount of output from each of a certain number of instances. 
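As background, pulling output from one in-kernel Jitter RNG instance over
AF_ALG boils down to the following simplified sketch (roughly what the
get_alg_fd()/get_req_fd() helpers used below do; error handling omitted and
the function name is illustrative only):

  #include <sys/socket.h>
  #include <linux/if_alg.h>

  static int open_jitterentropy_instance(void)
  {
          struct sockaddr_alg sa = {
                  .salg_family = AF_ALG,
                  .salg_type   = "rng",
                  .salg_name   = "jitterentropy_rng",
          };
          int alg_fd = socket(AF_ALG, SOCK_SEQPACKET, 0);

          bind(alg_fd, (struct sockaddr *)&sa, sizeof(sa));
          return accept(alg_fd, NULL, NULL);  /* read() from this fd yields RNG output */
  }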
The original plan was to use a build of the userspace jitterentropy library that matches the kernel's jitterentropy_rng as closely as possible. However, it's now being requested that the output be gotten from the kernel instead. Now that fips140_lab_util depends on AF_ALG anyway, it's straightforward to dump output from jitterentropy_rng instances using AF_ALG. Therefore, add a command dump_jitterentropy which supports this. Bug: 188620248 Change-Id: I78eb26250e88f2fc28fc44aa201acbe5b84df8bb Signed-off-by: Eric Biggers (cherry picked from commit dc015032666ece133065b7fea73f5709f735c9b0) --- samples/crypto/fips140_lab_util.c | 88 +++++++++++++++++++++++++++++++ 1 file changed, 88 insertions(+) diff --git a/samples/crypto/fips140_lab_util.c b/samples/crypto/fips140_lab_util.c index 996839dbd2e3..5f8e9018013a 100644 --- a/samples/crypto/fips140_lab_util.c +++ b/samples/crypto/fips140_lab_util.c @@ -24,6 +24,7 @@ #include #include +#include #include #include #include @@ -45,6 +46,8 @@ * ---------------------------------------------------------------------------*/ #define ARRAY_SIZE(A) (sizeof(A) / sizeof((A)[0])) +#define MIN(a, b) ((a) < (b) ? (a) : (b)) +#define MAX(a, b) ((a) > (b) ? (a) : (b)) static void __attribute__((noreturn)) do_die(const char *format, va_list va, int err) @@ -109,6 +112,23 @@ static const char *bytes_to_hex(const uint8_t *bytes, size_t count) return hex; } +static void full_write(int fd, const void *buf, size_t count) +{ + while (count) { + ssize_t ret = write(fd, buf, count); + + if (ret < 0) + die_errno("write failed"); + buf += ret; + count -= ret; + } +} + +enum { + OPT_AMOUNT, + OPT_ITERATIONS, +}; + static void usage(void); /* --------------------------------------------------------------------------- @@ -226,6 +246,68 @@ static int get_req_fd(int alg_fd, const char *alg_name) return req_fd; } +/* --------------------------------------------------------------------------- + * dump_jitterentropy command + * ---------------------------------------------------------------------------*/ + +static void dump_from_jent_fd(int fd, size_t count) +{ + uint8_t buf[AF_ALG_MAX_RNG_REQUEST_SIZE]; + + while (count) { + ssize_t ret; + + memset(buf, 0, sizeof(buf)); + ret = read(fd, buf, MIN(count, sizeof(buf))); + if (ret < 0) + die_errno("error reading from jitterentropy_rng"); + full_write(STDOUT_FILENO, buf, ret); + count -= ret; + } +} + +static int cmd_dump_jitterentropy(int argc, char *argv[]) +{ + static const struct option longopts[] = { + { "amount", required_argument, NULL, OPT_AMOUNT }, + { "iterations", required_argument, NULL, OPT_ITERATIONS }, + { NULL, 0, NULL, 0 }, + }; + size_t amount = 128; + size_t iterations = 1; + size_t i; + int c; + + while ((c = getopt_long(argc, argv, "", longopts, NULL)) != -1) { + switch (c) { + case OPT_AMOUNT: + amount = strtoul(optarg, NULL, 0); + if (amount <= 0 || amount >= ULONG_MAX) + die("invalid argument to --amount"); + break; + case OPT_ITERATIONS: + iterations = strtoul(optarg, NULL, 0); + if (iterations <= 0 || iterations >= ULONG_MAX) + die("invalid argument to --iterations"); + break; + default: + usage(); + return 1; + } + } + + for (i = 0; i < iterations; i++) { + int alg_fd = get_alg_fd("rng", "jitterentropy_rng"); + int req_fd = get_req_fd(alg_fd, "jitterentropy_rng"); + + dump_from_jent_fd(req_fd, amount); + + close(req_fd); + close(alg_fd); + } + return 0; +} + /* --------------------------------------------------------------------------- * show_invalid_inputs command * 
---------------------------------------------------------------------------*/ @@ -510,6 +592,7 @@ static const struct command { const char *name; int (*func)(int argc, char *argv[]); } commands[] = { + { "dump_jitterentropy", cmd_dump_jitterentropy }, { "show_invalid_inputs", cmd_show_invalid_inputs }, { "show_module_version", cmd_show_module_version }, { "show_service_indicators", cmd_show_service_indicators }, @@ -519,9 +602,14 @@ static void usage(void) { fprintf(stderr, "Usage:\n" +" fips140_lab_util dump_jitterentropy [OPTION]...\n" " fips140_lab_util show_invalid_inputs\n" " fips140_lab_util show_module_version\n" " fips140_lab_util show_service_indicators [SERVICE]...\n" +"\n" +"Options for dump_jitterentropy:\n" +" --amount=AMOUNT Amount to dump in bytes per iteration (default 128)\n" +" --iterations=COUNT Number of start-up iterations (default 1)\n" ); } From 135145909741b7d15018babc3f3d45e0c308aa7b Mon Sep 17 00:00:00 2001 From: Wei Liu Date: Wed, 28 Dec 2022 14:39:41 +0800 Subject: [PATCH 25/51] ANDROID: GKI: Update symbols to symbol list Update symbols to symbol list externed by oppo network group. 1 Added function: [A] 'function int __rtnl_link_register(rtnl_link_ops*)' Bug: 193384408 Signed-off-by: Wei Liu Change-Id: Ibd8f74fa1f3b68047f6fed9b5c4154c51f23b821 --- android/abi_gki_aarch64.xml | 121 +++++++++++++++++++++++++++++++++- android/abi_gki_aarch64_oplus | 1 + 2 files changed, 120 insertions(+), 2 deletions(-) diff --git a/android/abi_gki_aarch64.xml b/android/abi_gki_aarch64.xml index 5477e84597cf..49832c52416e 100644 --- a/android/abi_gki_aarch64.xml +++ b/android/abi_gki_aarch64.xml @@ -255,6 +255,7 @@ + @@ -13101,6 +13102,26 @@ + + + + + + + + + + + + + + + + + + + + @@ -22429,6 +22450,14 @@ + + + + + + + + @@ -69208,7 +69237,20 @@ - + + + + + + + + + + + + + + @@ -70110,6 +70152,7 @@ + @@ -76600,6 +76643,7 @@ + @@ -88213,7 +88257,65 @@ - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -112859,6 +112961,17 @@ + + + + + + + + + + + @@ -117062,6 +117175,10 @@ + + + + diff --git a/android/abi_gki_aarch64_oplus b/android/abi_gki_aarch64_oplus index ac8c1165135b..d1c65a2281cb 100644 --- a/android/abi_gki_aarch64_oplus +++ b/android/abi_gki_aarch64_oplus @@ -2297,6 +2297,7 @@ rtc_update_irq rtc_valid_tm rtnl_is_locked + __rtnl_link_register __rtnl_link_unregister rtnl_lock rtnl_unlock From c67f268c849edcebe3bccbb8797da6c31b7f1bb5 Mon Sep 17 00:00:00 2001 From: Kalesh Singh Date: Fri, 6 Jan 2023 13:07:18 -0800 Subject: [PATCH 26/51] Revert "ANDROID: Make SPF aware of fast mremaps" This reverts commit 134c1aae4311b6fb9823722647ab9c4ab554a152. 
Reason for revert: vts_linux_kselftest_arm_64 timeout Bug: 263479421 Bug: 263177905 Change-Id: I123c56741c982d1539ceebd8bfde2443871aa1de Signed-off-by: Kalesh Singh --- include/linux/mm.h | 3 -- mm/mmap.c | 30 ++------------ mm/mremap.c | 100 +++++++-------------------------------------- 3 files changed, 17 insertions(+), 116 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 5de4309bfa14..dfefcfa1d6a4 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1758,9 +1758,6 @@ int generic_access_phys(struct vm_area_struct *vma, unsigned long addr, void *buf, int len, int write); #ifdef CONFIG_SPECULATIVE_PAGE_FAULT -extern wait_queue_head_t vma_users_wait; -extern atomic_t vma_user_waiters; - static inline void vm_write_begin(struct vm_area_struct *vma) { /* diff --git a/mm/mmap.c b/mm/mmap.c index c3dfbfdb674a..fba57c628671 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -180,17 +180,7 @@ static void __free_vma(struct vm_area_struct *vma) #ifdef CONFIG_SPECULATIVE_PAGE_FAULT void put_vma(struct vm_area_struct *vma) { - int ref_count = atomic_dec_return(&vma->vm_ref_count); - - /* - * Implicit smp_mb due to atomic_dec_return. - * - * If this is the last reference, wake up the mremap waiter - * (if any). - */ - if (ref_count == 1 && unlikely(atomic_read(&vma_user_waiters) > 0)) - wake_up(&vma_users_wait); - else if (ref_count <= 0) + if (atomic_dec_and_test(&vma->vm_ref_count)) __free_vma(vma); } #else @@ -2431,22 +2421,8 @@ struct vm_area_struct *get_vma(struct mm_struct *mm, unsigned long addr) read_lock(&mm->mm_rb_lock); vma = __find_vma(mm, addr); - - /* - * If there is a concurrent fast mremap, bail out since the entire - * PMD/PUD subtree may have been remapped. - * - * This is usually safe for conventional mremap since it takes the - * PTE locks as does SPF. However fast mremap only takes the lock - * at the PMD/PUD level which is ok as it is done with the mmap - * write lock held. But since SPF, as the term implies forgoes, - * taking the mmap read lock and also cannot take PTL lock at the - * larger PMD/PUD granualrity, since it would introduce huge - * contention in the page fault path; fall back to regular fault - * handling. - */ - if (vma && !atomic_inc_unless_negative(&vma->vm_ref_count)) - vma = NULL; + if (vma) + atomic_inc(&vma->vm_ref_count); read_unlock(&mm->mm_rb_lock); return vma; diff --git a/mm/mremap.c b/mm/mremap.c index 0763b83ef779..5a18cec23fa7 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -210,74 +210,17 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd, drop_rmap_locks(vma); } -#ifdef CONFIG_SPECULATIVE_PAGE_FAULT -DECLARE_WAIT_QUEUE_HEAD(vma_users_wait); -atomic_t vma_user_waiters = ATOMIC_INIT(0); - -static inline void wait_for_vma_users(struct vm_area_struct *vma) -{ - /* - * If we have the only reference, swap the refcount to -1. This - * will prevent other concurrent references by get_vma() for SPFs. - */ - if (likely(atomic_cmpxchg(&vma->vm_ref_count, 1, -1) == 1)) - return; - - /* Indicate we are waiting for other users of the VMA to finish. */ - atomic_inc(&vma_user_waiters); - - /* Failed atomic_cmpxchg; no implicit barrier, use an explicit one. */ - smp_mb(); - - /* - * Callers cannot handle failure, sleep uninterruptibly until there - * are no other users of this VMA. - * - * We don't need to worry about references from concurrent waiters, - * since this is only used in the context of fast mremaps, with - * exclusive mmap write lock held. 
- */ - wait_event(vma_users_wait, atomic_cmpxchg(&vma->vm_ref_count, 1, -1) == 1); - - atomic_dec(&vma_user_waiters); -} - - /* - * Restore the VMA reference count to 1 after a fast mremap. + * Speculative page fault handlers will not detect page table changes done + * without ptl locking. */ -static inline void restore_vma_ref_count(struct vm_area_struct *vma) -{ - /* - * This should only be called after a corresponding, - * wait_for_vma_users() - */ - VM_BUG_ON_VMA(atomic_cmpxchg(&vma->vm_ref_count, -1, 1) != -1, - vma); -} -#else /* !CONFIG_SPECULATIVE_PAGE_FAULT */ -static inline void wait_for_vma_users(struct vm_area_struct *vma) -{ -} -static inline void restore_vma_ref_count(struct vm_area_struct *vma) -{ -} -#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */ - -#ifdef CONFIG_HAVE_MOVE_PMD +#if defined(CONFIG_HAVE_MOVE_PMD) && !defined(CONFIG_SPECULATIVE_PAGE_FAULT) static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd) { spinlock_t *old_ptl, *new_ptl; struct mm_struct *mm = vma->vm_mm; pmd_t pmd; - bool ret; - - /* - * Wait for concurrent users, since these can potentially be - * speculative page faults. - */ - wait_for_vma_users(vma); /* * The destination pmd shouldn't be established, free_pgtables() @@ -302,10 +245,8 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, * One alternative might be to just unmap the target pmd at * this point, and verify that it really is empty. We'll see. */ - if (WARN_ON_ONCE(!pmd_none(*new_pmd))) { - ret = false; - goto out; - } + if (WARN_ON_ONCE(!pmd_none(*new_pmd))) + return false; /* * We don't have to worry about the ordering of src and dst @@ -329,11 +270,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, spin_unlock(new_ptl); spin_unlock(old_ptl); - ret = true; - -out: - restore_vma_ref_count(vma); - return ret; + return true; } #else static inline bool move_normal_pmd(struct vm_area_struct *vma, @@ -344,29 +281,24 @@ static inline bool move_normal_pmd(struct vm_area_struct *vma, } #endif -#ifdef CONFIG_HAVE_MOVE_PUD +/* + * Speculative page fault handlers will not detect page table changes done + * without ptl locking. + */ +#if defined(CONFIG_HAVE_MOVE_PUD) && !defined(CONFIG_SPECULATIVE_PAGE_FAULT) static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr, unsigned long new_addr, pud_t *old_pud, pud_t *new_pud) { spinlock_t *old_ptl, *new_ptl; struct mm_struct *mm = vma->vm_mm; pud_t pud; - bool ret; - - /* - * Wait for concurrent users, since these can potentially be - * speculative page faults. - */ - wait_for_vma_users(vma); /* * The destination pud shouldn't be established, free_pgtables() * should have released it. */ - if (WARN_ON_ONCE(!pud_none(*new_pud))) { - ret = false; - goto out; - } + if (WARN_ON_ONCE(!pud_none(*new_pud))) + return false; /* * We don't have to worry about the ordering of src and dst @@ -390,11 +322,7 @@ static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr, spin_unlock(new_ptl); spin_unlock(old_ptl); - ret = true; - -out: - restore_vma_ref_count(vma); - return ret; + return true; } #else static inline bool move_normal_pud(struct vm_area_struct *vma, From 529351c4c8202aa7f5bc4a8a100e583a70ab6110 Mon Sep 17 00:00:00 2001 From: Kalesh Singh Date: Mon, 19 Dec 2022 21:07:49 -0800 Subject: [PATCH 27/51] ANDROID: Re-enable fast mremap and fix UAF with SPF SPF attempts page faults without taking the mmap lock, but takes the PTL. 
If there is a concurrent fast mremap (at PMD/PUD level), this can lead to a UAF as fast mremap will only take the PTL locks at the PMD/PUD level. SPF cannot take the PTL locks at the larger subtree granularity since this introduces much contention in the page fault paths. To address the race: 1) Only try fast mremaps if there are no users of the VMA. Android is concerned with this optimization in the context of GC stop-the-world pause. So there are no other threads active and this should almost always succeed. 2) Speculative faults detect ongoing fast mremaps and fallback to conventional fault handling (taking mmap read lock). Bug: 263177905 Change-Id: I23917e493ddc8576de19883cac053dfde9982b7f Signed-off-by: Kalesh Singh --- mm/mmap.c | 18 +++++++++++++++-- mm/mremap.c | 58 +++++++++++++++++++++++++++++++++++++++++++++-------- 2 files changed, 66 insertions(+), 10 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index fba57c628671..dd52d8764c2f 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2421,8 +2421,22 @@ struct vm_area_struct *get_vma(struct mm_struct *mm, unsigned long addr) read_lock(&mm->mm_rb_lock); vma = __find_vma(mm, addr); - if (vma) - atomic_inc(&vma->vm_ref_count); + + /* + * If there is a concurrent fast mremap, bail out since the entire + * PMD/PUD subtree may have been remapped. + * + * This is usually safe for conventional mremap since it takes the + * PTE locks as does SPF. However fast mremap only takes the lock + * at the PMD/PUD level which is ok as it is done with the mmap + * write lock held. But since SPF, as the term implies forgoes, + * taking the mmap read lock and also cannot take PTL lock at the + * larger PMD/PUD granualrity, since it would introduce huge + * contention in the page fault path; fall back to regular fault + * handling. + */ + if (vma && !atomic_inc_unless_negative(&vma->vm_ref_count)) + vma = NULL; read_unlock(&mm->mm_rb_lock); return vma; diff --git a/mm/mremap.c b/mm/mremap.c index 5a18cec23fa7..b1263e9a16af 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -210,11 +210,39 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd, drop_rmap_locks(vma); } +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT +static inline bool trylock_vma_ref_count(struct vm_area_struct *vma) +{ + /* + * If we have the only reference, swap the refcount to -1. This + * will prevent other concurrent references by get_vma() for SPFs. + */ + return atomic_cmpxchg(&vma->vm_ref_count, 1, -1) == 1; +} + /* - * Speculative page fault handlers will not detect page table changes done - * without ptl locking. + * Restore the VMA reference count to 1 after a fast mremap. */ -#if defined(CONFIG_HAVE_MOVE_PMD) && !defined(CONFIG_SPECULATIVE_PAGE_FAULT) +static inline void unlock_vma_ref_count(struct vm_area_struct *vma) +{ + /* + * This should only be called after a corresponding, + * successful trylock_vma_ref_count(). 
+ */ + VM_BUG_ON_VMA(atomic_cmpxchg(&vma->vm_ref_count, -1, 1) != -1, + vma); +} +#else /* !CONFIG_SPECULATIVE_PAGE_FAULT */ +static inline bool trylock_vma_ref_count(struct vm_area_struct *vma) +{ + return true; +} +static inline void unlock_vma_ref_count(struct vm_area_struct *vma) +{ +} +#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */ + +#ifdef CONFIG_HAVE_MOVE_PMD static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd) { @@ -248,6 +276,14 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, if (WARN_ON_ONCE(!pmd_none(*new_pmd))) return false; + /* + * We hold both exclusive mmap_lock and rmap_lock at this point and + * cannot block. If we cannot immediately take exclusive ownership + * of the VMA fallback to the move_ptes(). + */ + if (!trylock_vma_ref_count(vma)) + return false; + /* * We don't have to worry about the ordering of src and dst * ptlocks because exclusive mmap_lock prevents deadlock. @@ -270,6 +306,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr, spin_unlock(new_ptl); spin_unlock(old_ptl); + unlock_vma_ref_count(vma); return true; } #else @@ -281,11 +318,7 @@ static inline bool move_normal_pmd(struct vm_area_struct *vma, } #endif -/* - * Speculative page fault handlers will not detect page table changes done - * without ptl locking. - */ -#if defined(CONFIG_HAVE_MOVE_PUD) && !defined(CONFIG_SPECULATIVE_PAGE_FAULT) +#ifdef CONFIG_HAVE_MOVE_PUD static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr, unsigned long new_addr, pud_t *old_pud, pud_t *new_pud) { @@ -300,6 +333,14 @@ static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr, if (WARN_ON_ONCE(!pud_none(*new_pud))) return false; + /* + * We hold both exclusive mmap_lock and rmap_lock at this point and + * cannot block. If we cannot immediately take exclusive ownership + * of the VMA fallback to the move_ptes(). + */ + if (!trylock_vma_ref_count(vma)) + return false; + /* * We don't have to worry about the ordering of src and dst * ptlocks because exclusive mmap_lock prevents deadlock. 
@@ -322,6 +363,7 @@ static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr, spin_unlock(new_ptl); spin_unlock(old_ptl); + unlock_vma_ref_count(vma); return true; } #else From 1960d4cfad2d1a1685e84a2056d42c56daee2c16 Mon Sep 17 00:00:00 2001 From: Kever Yang Date: Wed, 21 Dec 2022 23:08:35 +0800 Subject: [PATCH 28/51] ANDROID: GKI: rockchip: Enable symbols for Ethernet Leaf changes summary: 28 artifacts changed Changed leaf types summary: 0 leaf type changed Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 28 Added functions Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable 28 Added functions: [A] 'function void ethtool_convert_legacy_u32_to_link_mode(unsigned long int*, u32)' [A] 'function bool ethtool_convert_link_mode_to_legacy_u32(unsigned int*, const unsigned long int*)' [A] 'function int flow_block_cb_setup_simple(flow_block_offload*, list_head*, flow_setup_cb_t*, void*, void*, bool)' [A] 'function void flow_rule_match_basic(const flow_rule*, flow_match_basic*)' [A] 'function void flow_rule_match_ipv4_addrs(const flow_rule*, flow_match_ipv4_addrs*)' [A] 'function void flow_rule_match_ports(const flow_rule*, flow_match_ports*)' [A] 'function void netdev_rss_key_fill(void*, size_t)' [A] 'function page* page_pool_alloc_pages(page_pool*, gfp_t)' [A] 'function page_pool* page_pool_create(const page_pool_params*)' [A] 'function void page_pool_destroy(page_pool*)' [A] 'function void page_pool_put_page(page_pool*, page*, unsigned int, bool)' [A] 'function void page_pool_release_page(page_pool*, page*)' [A] 'function void phylink_disconnect_phy(phylink*)' [A] 'function int phylink_ethtool_get_eee(phylink*, ethtool_eee*)' [A] 'function void phylink_ethtool_get_pauseparam(phylink*, ethtool_pauseparam*)' [A] 'function void phylink_ethtool_get_wol(phylink*, ethtool_wolinfo*)' [A] 'function int phylink_ethtool_ksettings_get(phylink*, ethtool_link_ksettings*)' [A] 'function int phylink_ethtool_ksettings_set(phylink*, const ethtool_link_ksettings*)' [A] 'function int phylink_ethtool_nway_reset(phylink*)' [A] 'function int phylink_ethtool_set_eee(phylink*, ethtool_eee*)' [A] 'function int phylink_ethtool_set_pauseparam(phylink*, ethtool_pauseparam*)' [A] 'function int phylink_ethtool_set_wol(phylink*, ethtool_wolinfo*)' [A] 'function int phylink_get_eee_err(phylink*)' [A] 'function void phylink_mac_change(phylink*, bool)' [A] 'function int phylink_mii_ioctl(phylink*, ifreq*, int)' [A] 'function int phylink_speed_down(phylink*, bool)' [A] 'function int phylink_speed_up(phylink*)' [A] 'function void phylink_stop(phylink*)' Bug: 239396464 Signed-off-by: Kever Yang Change-Id: I6adf45c9a7159ef07a1913222248128afc0dcccb --- android/abi_gki_aarch64.xml | 711 +++++++++++++++++++++++++++++++ android/abi_gki_aarch64_rockchip | 134 +++++- 2 files changed, 822 insertions(+), 23 deletions(-) diff --git a/android/abi_gki_aarch64.xml b/android/abi_gki_aarch64.xml index 49832c52416e..67f1874ac840 100644 --- a/android/abi_gki_aarch64.xml +++ b/android/abi_gki_aarch64.xml @@ -2499,6 +2499,8 @@ + + @@ -2562,6 +2564,10 @@ + + + + @@ -3667,6 +3673,7 @@ + @@ -3912,6 +3919,11 @@ + + + + + @@ -4144,9 +4156,25 @@ + + + + + + + + + + + + + + + + @@ -7170,6 +7198,7 @@ + @@ -7294,6 +7323,9 @@ + + + @@ -10036,11 +10068,20 @@ + + + + + + + + + @@ -11797,6 +11838,14 @@ + + + + + + + + @@ -11983,6 +12032,17 @@ + + + + + + + + + + + @@ -15162,6 +15222,62 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 
@@ -19310,6 +19426,32 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -21332,6 +21474,20 @@ + + + + + + + + + + + + + + @@ -21871,6 +22027,14 @@ + + + + + + + + @@ -22301,6 +22465,7 @@ + @@ -23535,6 +23700,7 @@ + @@ -27508,6 +27674,7 @@ + @@ -27535,6 +27702,26 @@ + + + + + + + + + + + + + + + + + + + + @@ -31735,6 +31922,7 @@ + @@ -33688,6 +33876,17 @@ + + + + + + + + + + + @@ -34300,6 +34499,9 @@ + + + @@ -38886,6 +39088,14 @@ + + + + + + + + @@ -39921,6 +40131,17 @@ + + + + + + + + + + + @@ -40490,6 +40711,7 @@ + @@ -40788,6 +41010,14 @@ + + + + + + + + @@ -41378,6 +41608,7 @@ + @@ -41767,6 +41998,7 @@ + @@ -46527,6 +46759,7 @@ + @@ -46639,6 +46872,32 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -48756,6 +49015,7 @@ + @@ -49016,6 +49276,7 @@ + @@ -50662,6 +50923,41 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -51599,6 +51895,14 @@ + + + + + + + + @@ -52087,6 +52391,20 @@ + + + + + + + + + + + + + + @@ -53056,6 +53374,7 @@ + @@ -54164,6 +54483,20 @@ + + + + + + + + + + + + + + @@ -54977,6 +55310,17 @@ + + + + + + + + + + + @@ -59147,6 +59491,7 @@ + @@ -60256,6 +60601,44 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -62717,6 +63100,7 @@ + @@ -63894,6 +64278,7 @@ + @@ -64694,6 +65079,11 @@ + + + + + @@ -65669,6 +66059,7 @@ + @@ -66257,6 +66648,20 @@ + + + + + + + + + + + + + + @@ -66555,6 +66960,17 @@ + + + + + + + + + + + @@ -67200,6 +67616,39 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -71936,6 +72385,14 @@ + + + + + + + + @@ -75007,6 +75464,11 @@ + + + + + @@ -75646,6 +76108,7 @@ + @@ -77713,6 +78176,14 @@ + + + + + + + + @@ -83297,6 +83768,14 @@ + + + + + + + + @@ -88787,6 +89266,15 @@ + + + + + + + + + @@ -91285,6 +91773,7 @@ + @@ -93341,6 +93830,17 @@ + + + + + + + + + + + @@ -93815,6 +94315,7 @@ + @@ -94366,6 +94867,11 @@ + + + + + @@ -94997,6 +95503,29 @@ + + + + + + + + + + + + + + + + + + + + + + + @@ -100058,6 +100587,7 @@ + @@ -101118,6 +101648,23 @@ + + + + + + + + + + + + + + + + + @@ -102500,6 +103047,14 @@ + + + + + + + + @@ -110691,6 +111246,14 @@ + + + + + + + + @@ -111371,6 +111934,14 @@ + + + + + + + + @@ -129997,6 +130568,16 @@ + + + + + + + + + + @@ -130331,7 +130912,31 @@ + + + + + + + + + + + + + + + + + + + + + + + + @@ -135962,6 +136567,11 @@ + + + + + @@ -137236,6 +137846,31 @@ + + + + + + + + + + + + + + + + + + + + + + + + + @@ -138448,6 +139083,69 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -138458,10 +139156,23 @@ + + + + + + + + + + + + + diff --git a/android/abi_gki_aarch64_rockchip b/android/abi_gki_aarch64_rockchip index 6489a344910d..12d5af00e945 100644 --- a/android/abi_gki_aarch64_rockchip +++ b/android/abi_gki_aarch64_rockchip @@ -22,6 +22,8 @@ bdget_disk bdput _bin2bcd + __bitmap_and + __bitmap_andnot blk_cleanup_queue blk_execute_rq_nowait blk_mq_free_request @@ -174,6 +176,7 @@ debugfs_create_file debugfs_create_regset32 debugfs_remove + debugfs_rename default_llseek delayed_work_timer_fn del_gendisk @@ -200,6 +203,7 @@ device_del device_destroy device_get_child_node_count + device_get_match_data device_get_named_child_node device_get_next_child_node device_initialize @@ -285,6 +289,7 @@ devm_snd_soc_register_component devm_usb_get_phy _dev_notice + dev_open dev_pm_domain_detach dev_pm_opp_find_freq_ceil dev_pm_opp_find_freq_floor @@ -469,6 +474,8 @@ enable_irq eth_mac_addr eth_platform_get_mac_address + ethtool_op_get_link + 
ethtool_op_get_ts_info eth_type_trans eth_validate_addr event_triggers_call @@ -480,6 +487,7 @@ extcon_set_state_sync extcon_unregister_notifier failure_tracking + fasync_helper fd_install find_next_bit find_next_zero_bit @@ -584,6 +592,9 @@ i2c_smbus_xfer i2c_transfer i2c_transfer_buffer_flags + ida_alloc_range + ida_destroy + ida_free idr_alloc idr_destroy idr_find @@ -661,6 +672,7 @@ kfree_const kfree_sensitive kfree_skb + kill_fasync kimage_voffset __kmalloc kmalloc_caches @@ -682,6 +694,10 @@ kstrtouint_from_user kstrtoull kthread_create_on_node + kthread_create_worker + kthread_destroy_worker + kthread_flush_worker + kthread_queue_work kthread_should_stop kthread_stop ktime_get @@ -703,7 +719,10 @@ __log_read_mmio __log_write_mmio lzo1x_decompress_safe + mdiobus_alloc_size + mdiobus_free mdiobus_read + mdiobus_unregister mdiobus_write media_create_pad_link media_device_init @@ -753,10 +772,14 @@ mutex_lock_interruptible mutex_trylock mutex_unlock + napi_gro_receive __netdev_alloc_skb netdev_err netdev_info + netdev_update_features netdev_warn + netif_carrier_off + netif_carrier_on netif_rx netif_rx_ni netif_tx_wake_queue @@ -850,6 +873,7 @@ perf_trace_buf_alloc perf_trace_run_bpf_submit pfn_valid + phy_attached_info phy_configure phy_drivers_register phy_drivers_unregister @@ -965,6 +989,7 @@ __register_chrdev register_chrdev_region register_inetaddr_notifier + register_netdev register_netdevice register_netdevice_notifier register_pm_notifier @@ -1069,6 +1094,7 @@ simple_strtoul single_open single_release + skb_add_rx_frag skb_clone skb_copy skb_copy_bits @@ -1126,6 +1152,7 @@ sscanf __stack_chk_fail __stack_chk_guard + strcasecmp strchr strcmp strcpy @@ -1147,6 +1174,7 @@ sync_file_create sync_file_get_fence synchronize_irq + synchronize_net synchronize_rcu syscon_node_to_regmap syscon_regmap_lookup_by_phandle @@ -1194,6 +1222,7 @@ __unregister_chrdev unregister_chrdev_region unregister_inetaddr_notifier + unregister_netdev unregister_netdevice_notifier unregister_netdevice_queue unregister_reboot_notifier @@ -1389,7 +1418,6 @@ # required by bcmdhd.ko alloc_etherdev_mqs complete_and_exit - dev_open down_interruptible down_timeout iwe_stream_add_event @@ -1400,7 +1428,6 @@ mmc_set_data_timeout mmc_sw_reset mmc_wait_for_req - netdev_update_features netif_napi_add __netif_napi_del netif_set_xps_queue @@ -1411,7 +1438,6 @@ __nlmsg_put _raw_read_lock_bh _raw_read_unlock_bh - register_netdev sched_set_fifo_low sdio_claim_host sdio_disable_func @@ -1445,13 +1471,11 @@ strcat strspn sys_tz - unregister_netdev unregister_pm_notifier wireless_send_event # required by bifrost_kbase.ko __arch_clear_user - __bitmap_andnot __bitmap_equal __bitmap_or __bitmap_weight @@ -1537,7 +1561,6 @@ # required by cfg80211.ko bpf_trace_run10 _ctype - debugfs_rename dev_change_net_namespace __dev_get_by_index dev_get_by_index @@ -1570,7 +1593,6 @@ rfkill_blocked rfkill_pause_polling rfkill_resume_polling - skb_add_rx_frag __sock_create sock_release unregister_pernet_device @@ -1759,6 +1781,11 @@ usb_speed_string usb_wakeup_enabled_descendants +# required by dwmac-rockchip.ko + csum_tcpudp_nofold + ip_send_check + of_get_phy_mode + # required by fusb302.ko extcon_get_extcon_dev fwnode_create_software_node @@ -1881,7 +1908,6 @@ dev_fetch_sw_netstats dev_queue_xmit ether_setup - ethtool_op_get_link get_random_u32 __hw_addr_init __hw_addr_sync @@ -1890,10 +1916,7 @@ kernel_param_unlock kfree_skb_list ktime_get_seconds - napi_gro_receive netdev_set_default_ethtool_ops - netif_carrier_off - netif_carrier_on 
netif_receive_skb netif_receive_skb_list netif_tx_stop_all_queues @@ -1919,7 +1942,6 @@ skb_queue_head skb_queue_purge skb_queue_tail - synchronize_net unregister_inet6addr_notifier unregister_netdevice_many @@ -1956,9 +1978,6 @@ dev_pm_qos_expose_latency_tolerance dev_pm_qos_hide_latency_tolerance dev_pm_qos_update_user_latency_tolerance - ida_alloc_range - ida_destroy - ida_free init_srcu_struct memchr_inv param_ops_ulong @@ -2107,7 +2126,6 @@ extcon_sync # required by phy-rockchip-inno-usb3.ko - strcasecmp usb_add_phy # required by phy-rockchip-samsung-hdptx-hdmi.ko @@ -2159,6 +2177,18 @@ pm_genpd_init pm_genpd_remove +# required by pps_core.ko + kobject_get + +# required by ptp.ko + kthread_cancel_delayed_work_sync + kthread_delayed_work_timer_fn + kthread_mod_delayed_work + kthread_queue_delayed_work + ktime_get_snapshot + posix_clock_register + posix_clock_unregister + # required by pwm-regulator.ko regulator_map_voltage_iterate @@ -2562,10 +2592,8 @@ blk_verify_command cdev_alloc class_interface_unregister - fasync_helper get_sg_io_hdr import_iovec - kill_fasync put_sg_io_hdr _raw_read_lock_irqsave _raw_read_unlock_irqrestore @@ -2593,12 +2621,7 @@ # required by smsc95xx.ko csum_partial - ethtool_op_get_ts_info - mdiobus_alloc_size - mdiobus_free __mdiobus_register - mdiobus_unregister - phy_attached_info phy_connect_direct phy_disconnect phy_ethtool_get_link_ksettings @@ -2677,6 +2700,71 @@ spi_setup stream_open +# required by stmmac-platform.ko + device_get_phy_mode + of_get_mac_address + of_phy_is_fixed_link + platform_get_irq_byname_optional + +# required by stmmac.ko + devm_alloc_etherdev_mqs + dql_completed + dql_reset + ethtool_convert_legacy_u32_to_link_mode + ethtool_convert_link_mode_to_legacy_u32 + flow_block_cb_setup_simple + flow_rule_match_basic + flow_rule_match_ipv4_addrs + flow_rule_match_ports + mdiobus_get_phy + __napi_alloc_skb + napi_complete_done + napi_disable + __napi_schedule + napi_schedule_prep + netdev_alert + netdev_pick_tx + netdev_rss_key_fill + netif_device_attach + netif_device_detach + netif_napi_add + __netif_napi_del + netif_schedule_queue + netif_set_real_num_rx_queues + netif_set_real_num_tx_queues + of_mdiobus_register + page_pool_alloc_pages + page_pool_create + page_pool_destroy + page_pool_put_page + page_pool_release_page + phy_init_eee + phylink_connect_phy + phylink_create + phylink_destroy + phylink_disconnect_phy + phylink_ethtool_get_eee + phylink_ethtool_get_pauseparam + phylink_ethtool_get_wol + phylink_ethtool_ksettings_get + phylink_ethtool_ksettings_set + phylink_ethtool_nway_reset + phylink_ethtool_set_eee + phylink_ethtool_set_pauseparam + phylink_ethtool_set_wol + phylink_get_eee_err + phylink_mac_change + phylink_mii_ioctl + phylink_of_phy_connect + phylink_set_port_modes + phylink_speed_down + phylink_speed_up + phylink_start + phylink_stop + pm_wakeup_dev_event + reset_control_reset + skb_tstamp_tx + # required by sw_sync.ko dma_fence_free dma_fence_signal_locked From 91e760f1f24b5dbc72e570863a42b6fc20f3a60f Mon Sep 17 00:00:00 2001 From: Kever Yang Date: Wed, 21 Dec 2022 23:34:03 +0800 Subject: [PATCH 29/51] ANDROID: GKI: rockchip: Enable symbols for HDMIRX Leaf changes summary: 1 artifact changed Changed leaf types summary: 0 leaf type changed Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 1 Added function Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable 1 Added function: [A] 'function bool v4l2_find_dv_timings_cap(v4l2_dv_timings*, const v4l2_dv_timings_cap*, unsigned int, 
v4l2_check_dv_timings_fnc*, void*)' Bug: 239396464 Signed-off-by: Kever Yang Change-Id: I45009e2791f99b65476daafaff343df33af72433 --- android/abi_gki_aarch64.xml | 9 +++++++++ android/abi_gki_aarch64_rockchip | 19 +++++++++++++++---- 2 files changed, 24 insertions(+), 4 deletions(-) diff --git a/android/abi_gki_aarch64.xml b/android/abi_gki_aarch64.xml index 67f1874ac840..bda4150b97dd 100644 --- a/android/abi_gki_aarch64.xml +++ b/android/abi_gki_aarch64.xml @@ -5954,6 +5954,7 @@ + @@ -148267,6 +148268,14 @@ + + + + + + + + diff --git a/android/abi_gki_aarch64_rockchip b/android/abi_gki_aarch64_rockchip index 12d5af00e945..ad2a3046e510 100644 --- a/android/abi_gki_aarch64_rockchip +++ b/android/abi_gki_aarch64_rockchip @@ -122,6 +122,7 @@ __const_udelay consume_skb cpu_bit_bitmap + cpufreq_cpu_get __cpufreq_driver_target cpufreq_generic_suspend cpufreq_register_governor @@ -219,6 +220,7 @@ device_remove_file device_set_wakeup_capable device_set_wakeup_enable + device_unregister device_wakeup_enable _dev_info __dev_kfree_skb_any @@ -1187,6 +1189,7 @@ sysfs_remove_link sysfs_streq system_freezable_wq + system_highpri_wq system_long_wq system_power_efficient_wq system_state @@ -1528,7 +1531,6 @@ simple_open strcspn system_freezing_cnt - system_highpri_wq _totalram_pages __traceiter_gpu_mem_total trace_output_call @@ -2069,7 +2071,6 @@ __arm_smccc_hvc bus_for_each_dev device_register - device_unregister free_pages_exact memremap memunmap @@ -2320,6 +2321,18 @@ dev_pm_opp_put_prop_name dev_pm_opp_set_supported_hw +# required by rockchip-hdmirx.ko + cec_s_phys_addr_from_edid + cpu_latency_qos_remove_request + device_create_with_groups + of_reserved_mem_device_release + v4l2_ctrl_log_status + v4l2_ctrl_subscribe_event + v4l2_find_dv_timings_cap + v4l2_src_change_event_subscribe + vb2_dma_contig_memops + vb2_fop_read + # required by rockchip-rng.ko devm_hwrng_register devm_of_iomap @@ -2328,7 +2341,6 @@ cpu_topology # required by rockchip_dmc.ko - cpufreq_cpu_get cpufreq_cpu_put cpufreq_quick_get devfreq_event_disable_edev @@ -2574,7 +2586,6 @@ sdhci_setup_host # required by sdhci-of-dwcmshc.ko - device_get_match_data devm_clk_bulk_get_optional dma_get_required_mask sdhci_adma_write_desc From b460d3c09a7ba6f020b532a3a21acb58acebc315 Mon Sep 17 00:00:00 2001 From: Kever Yang Date: Thu, 22 Dec 2022 11:03:08 +0800 Subject: [PATCH 30/51] ANDROID: GKI: rockchip: Update module fragment and symbol list Add fragment need by rockchip platform and sync the symbol list to the latest source code. This patch does not add or remove symbol from xml file. 
Bug: 239396464 Signed-off-by: Kever Yang Change-Id: I54e37e865124cbc7f70646481ca798e27fcc4706 --- android/abi_gki_aarch64_rockchip | 132 +++++++++++++---------- arch/arm64/configs/rockchip_gki.fragment | 12 ++- 2 files changed, 84 insertions(+), 60 deletions(-) diff --git a/android/abi_gki_aarch64_rockchip b/android/abi_gki_aarch64_rockchip index ad2a3046e510..ec0b9c854e41 100644 --- a/android/abi_gki_aarch64_rockchip +++ b/android/abi_gki_aarch64_rockchip @@ -82,10 +82,13 @@ __cfi_slowpath __check_object_size __class_create + class_create_file_ns class_destroy class_for_each_device __class_register + class_remove_file_ns class_unregister + __ClearPageMovable clk_bulk_disable clk_bulk_enable clk_bulk_prepare @@ -171,6 +174,7 @@ crypto_unregister_shash crypto_unregister_template __crypto_xor + _ctype debugfs_attr_read debugfs_attr_write debugfs_create_dir @@ -183,7 +187,6 @@ del_gendisk del_timer del_timer_sync - desc_to_gpio destroy_workqueue dev_close dev_driver_string @@ -192,6 +195,7 @@ devfreq_add_governor devfreq_recommended_opp devfreq_register_opp_notifier + devfreq_remove_governor devfreq_resume_device devfreq_suspend_device devfreq_unregister_opp_notifier @@ -234,6 +238,7 @@ devm_devfreq_add_device devm_devfreq_event_add_edev devm_devfreq_register_opp_notifier + devm_device_add_group devm_extcon_dev_allocate devm_extcon_dev_register devm_free_irq @@ -306,6 +311,7 @@ dev_pm_opp_register_set_opp_helper dev_pm_opp_set_rate dev_pm_opp_set_regulators + dev_pm_opp_set_supported_hw dev_pm_opp_unregister_set_opp_helper dev_printk devres_add @@ -368,8 +374,8 @@ driver_register driver_unregister drm_add_edid_modes + drm_add_modes_noedid drm_atomic_get_crtc_state - drm_atomic_get_new_bridge_state drm_atomic_get_new_connector_for_encoder drm_atomic_helper_bridge_destroy_state drm_atomic_helper_bridge_duplicate_state @@ -387,8 +393,12 @@ drm_compat_ioctl drm_connector_attach_encoder drm_connector_cleanup + drm_connector_has_possible_encoder drm_connector_init drm_connector_init_with_ddc + drm_connector_list_iter_begin + drm_connector_list_iter_end + drm_connector_list_iter_next drm_connector_unregister drm_connector_update_edid_property __drm_dbg @@ -406,7 +416,9 @@ drm_dp_aux_register drm_dp_aux_unregister drm_dp_bw_code_to_link_rate + drm_dp_channel_eq_ok drm_dp_dpcd_read + drm_dp_dpcd_read_link_status drm_dp_dpcd_write drm_dp_get_phy_test_pattern drm_dp_link_rate_to_bw_code @@ -550,6 +562,7 @@ gpiod_get_value_cansleep gpiod_set_consumer_name gpiod_set_raw_value + gpiod_set_raw_value_cansleep gpiod_set_value gpiod_set_value_cansleep gpiod_to_irq @@ -579,6 +592,7 @@ hrtimer_start_range_ns i2c_adapter_type i2c_add_adapter + i2c_add_numbered_adapter i2c_del_adapter i2c_del_driver i2c_get_adapter @@ -651,7 +665,6 @@ irq_find_mapping irq_get_irq_data irq_modify_status - irq_of_parse_and_map irq_set_affinity_hint irq_set_chained_handler_and_data irq_set_chip @@ -659,6 +672,7 @@ irq_set_chip_data irq_set_irq_type irq_set_irq_wake + irq_to_desc is_vmalloc_addr jiffies jiffies_to_msecs @@ -717,6 +731,7 @@ __list_add_valid __list_del_entry_valid __local_bh_enable_ip + __lock_page __log_post_read_mmio __log_read_mmio __log_write_mmio @@ -739,7 +754,6 @@ media_pipeline_start media_pipeline_stop memcpy - __memcpy_fromio memdup_user memmove memset @@ -758,7 +772,9 @@ mipi_dsi_host_unregister misc_deregister misc_register + mmc_cqe_request_done mmc_of_parse + mmc_request_done __mmdrop mod_delayed_work_on mod_timer @@ -814,6 +830,7 @@ of_device_is_available of_device_is_compatible of_drm_find_bridge + 
of_drm_find_panel of_find_compatible_node of_find_device_by_node of_find_i2c_device_by_node @@ -832,6 +849,7 @@ of_get_parent of_get_property of_get_regulator_init_data + of_graph_get_endpoint_by_regs of_graph_get_next_endpoint of_graph_get_remote_node of_graph_get_remote_port_parent @@ -914,6 +932,7 @@ platform_driver_unregister platform_get_irq platform_get_irq_byname + platform_get_irq_optional platform_get_resource platform_get_resource_byname platform_irq_count @@ -940,6 +959,7 @@ power_supply_changed power_supply_class power_supply_get_battery_info + power_supply_get_by_name power_supply_get_by_phandle power_supply_get_drvdata power_supply_get_property @@ -963,6 +983,7 @@ put_disk __put_page __put_task_struct + put_unused_fd pwm_adjust_config pwm_apply_state queue_delayed_work_on @@ -973,6 +994,7 @@ _raw_spin_lock_bh _raw_spin_lock_irq _raw_spin_lock_irqsave + _raw_spin_trylock _raw_spin_unlock _raw_spin_unlock_bh _raw_spin_unlock_irq @@ -1080,6 +1102,7 @@ seq_puts seq_read set_page_dirty_lock + __SetPageMovable sg_alloc_table sg_alloc_table_from_pages sg_free_table @@ -1113,6 +1136,7 @@ skcipher_walk_virt snd_pcm_format_width snd_soc_add_component_controls + snd_soc_add_dai_controls snd_soc_card_jack_new snd_soc_component_read snd_soc_component_set_jack @@ -1145,6 +1169,7 @@ snd_soc_pm_ops snd_soc_put_enum_double snd_soc_put_volsw + snd_soc_register_component snd_soc_unregister_component snprintf sort @@ -1222,6 +1247,7 @@ typec_switch_register typec_switch_unregister __udelay + unlock_page __unregister_chrdev unregister_chrdev_region unregister_inetaddr_notifier @@ -1431,8 +1457,6 @@ mmc_set_data_timeout mmc_sw_reset mmc_wait_for_req - netif_napi_add - __netif_napi_del netif_set_xps_queue __netlink_kernel_create netlink_kernel_release @@ -1482,6 +1506,7 @@ __bitmap_equal __bitmap_or __bitmap_weight + __bitmap_xor cache_line_size clear_page complete_all @@ -1542,9 +1567,6 @@ vmalloc_user vmf_insert_pfn_prot -# required by bq25700_charger.ko - power_supply_get_by_name - # required by cdc-wdm.ko cdc_parse_cdc_header @@ -1562,7 +1584,6 @@ # required by cfg80211.ko bpf_trace_run10 - _ctype dev_change_net_namespace __dev_get_by_index dev_get_by_index @@ -1652,11 +1673,6 @@ # required by cm3218.ko i2c_smbus_write_word_data -# required by cma_heap.ko - cma_get_name - dma_heap_get_drvdata - dma_heap_put - # required by cpufreq-dt.ko cpufreq_enable_boost_support cpufreq_freq_attr_scaling_available_freqs @@ -1691,7 +1707,6 @@ # required by cqhci.ko devm_blk_ksm_init - mmc_cqe_request_done # required by cryptodev.ko crypto_ahash_final @@ -1702,7 +1717,11 @@ sg_last unregister_sysctl_table +# required by cw221x_battery.ko + power_supply_is_system_supplied + # required by display-connector.ko + drm_atomic_get_new_bridge_state drm_probe_ddc # required by dm9601.ko @@ -1720,7 +1739,6 @@ # required by dw-hdmi.ko drm_connector_attach_max_bpc_property drm_default_rgb_quant_range - of_graph_get_endpoint_by_regs # required by dw-mipi-dsi.ko drm_panel_bridge_add_typed @@ -1746,14 +1764,12 @@ mmc_regulator_set_ocr mmc_regulator_set_vqmmc mmc_remove_host - mmc_request_done sdio_signal_irq sg_miter_next sg_miter_start sg_miter_stop # required by dw_wdt.ko - platform_get_irq_optional watchdog_init_timeout watchdog_register_device watchdog_set_restart_priority @@ -1764,7 +1780,6 @@ bitmap_find_next_zero_area_off __bitmap_set phy_reset - _raw_spin_trylock usb_add_gadget_udc usb_del_gadget_udc usb_ep_set_maxpacket_limit @@ -1791,6 +1806,7 @@ # required by fusb302.ko extcon_get_extcon_dev 
fwnode_create_software_node + sched_set_fifo tcpm_cc_change tcpm_pd_hard_reset tcpm_pd_receive @@ -1842,6 +1858,7 @@ i2c_verify_client # required by i2c-gpio.ko + desc_to_gpio i2c_bit_add_numbered_bus # required by i2c-hid.ko @@ -1852,7 +1869,6 @@ hid_parse_report # required by i2c-mux.ko - i2c_add_numbered_adapter __i2c_transfer rt_mutex_lock rt_mutex_trylock @@ -2097,6 +2113,7 @@ # required by pcie-dw-rockchip.ko cpumask_next_and + debugfs_create_devm_seqfile dw_pcie_find_ext_capability dw_pcie_host_init dw_pcie_link_up @@ -2173,10 +2190,13 @@ clk_bulk_put of_genpd_add_provider_onecell panic + param_get_bool + param_set_bool pm_clk_add_clk pm_genpd_add_subdomain pm_genpd_init pm_genpd_remove + pm_wq # required by pps_core.ko kobject_get @@ -2205,6 +2225,13 @@ pwm_free pwm_request +# required by pwrseq_simple.ko + bitmap_alloc + devm_gpiod_get_array + gpiod_set_array_value_cansleep + mmc_pwrseq_register + mmc_pwrseq_unregister + # required by reboot-mode.ko devres_release kernel_kobj @@ -2218,6 +2245,7 @@ alloc_iova_fast dma_fence_wait_timeout free_iova_fast + idr_alloc_cyclic kstrdup_quotable_cmdline mmput @@ -2226,9 +2254,6 @@ irq_domain_xlate_onetwocell irq_set_parent -# required by rk628_dsi.ko - of_drm_find_panel - # required by rk805-pwrkey.ko devm_request_any_context_irq @@ -2253,6 +2278,10 @@ # required by rk860x-regulator.ko regulator_suspend_enable +# required by rk_cma_heap.ko + dma_heap_get_drvdata + dma_heap_put + # required by rk_crypto.ko crypto_ahash_digest crypto_dequeue_request @@ -2281,8 +2310,15 @@ # required by rk_ircut.ko drain_workqueue +# required by rk_system_heap.ko + deferred_free + dmabuf_page_pool_alloc + dmabuf_page_pool_create + dmabuf_page_pool_destroy + dmabuf_page_pool_free + swiotlb_max_segment + # required by rk_vcodec.ko - devfreq_remove_governor devm_iounmap dev_pm_domain_attach dev_pm_opp_get_freq @@ -2292,9 +2328,7 @@ __fdget iommu_device_unregister iommu_dma_reserve_iova - kthread_flush_worker __kthread_init_worker - kthread_queue_work kthread_worker_fn of_device_alloc of_dma_configure_id @@ -2319,7 +2353,6 @@ # required by rockchip-cpufreq.ko cpufreq_unregister_notifier dev_pm_opp_put_prop_name - dev_pm_opp_set_supported_hw # required by rockchip-hdmirx.ko cec_s_phys_addr_from_edid @@ -2340,6 +2373,9 @@ # required by rockchip_bus.ko cpu_topology +# required by rockchip_debug.ko + nr_irqs + # required by rockchip_dmc.ko cpufreq_cpu_put cpufreq_quick_get @@ -2373,7 +2409,6 @@ regulator_get_linear_step # required by rockchip_pwm_remotectl.ko - irq_to_desc __tasklet_hi_schedule # required by rockchip_saradc.ko @@ -2407,7 +2442,6 @@ component_unbind_all devm_of_phy_get_by_index driver_find_device - drm_add_modes_noedid drm_atomic_commit drm_atomic_get_connector_state drm_atomic_get_plane_state @@ -2441,11 +2475,8 @@ drm_atomic_set_mode_for_crtc drm_atomic_state_alloc __drm_atomic_state_free + drm_bridge_chain_mode_set drm_bridge_get_edid - drm_connector_has_possible_encoder - drm_connector_list_iter_begin - drm_connector_list_iter_end - drm_connector_list_iter_next drm_connector_list_update drm_crtc_cleanup drm_crtc_enable_color_mgmt @@ -2459,9 +2490,7 @@ drm_crtc_vblank_put drm_debugfs_create_files drm_do_get_edid - drm_dp_channel_eq_ok drm_dp_clock_recovery_ok - drm_dp_dpcd_read_link_status drm_dp_get_adjust_request_pre_emphasis drm_dp_get_adjust_request_voltage drm_dp_read_desc @@ -2592,10 +2621,6 @@ sdhci_remove_host sdhci_request -# required by sensor_dev.ko - class_create_file_ns - class_remove_file_ns - # required by sg.ko blk_get_request 
blk_put_request @@ -2650,9 +2675,6 @@ # required by snd-soc-es8316.ko snd_pcm_hw_constraint_list -# required by snd-soc-es8326.ko - snd_soc_register_component - # required by snd-soc-hdmi-codec.ko snd_ctl_add snd_ctl_new1 @@ -2661,7 +2683,7 @@ snd_pcm_fill_iec958_consumer snd_pcm_fill_iec958_consumer_hw_params snd_pcm_hw_constraint_eld - snd_pcm_stop_xrun + snd_pcm_stop # required by snd-soc-rk817.ko snd_soc_component_exit_regmap @@ -2672,7 +2694,8 @@ # required by snd-soc-rockchip-i2s-tdm.ko clk_is_match - snd_soc_add_dai_controls + pm_runtime_forbid + snd_pcm_stop_xrun # required by snd-soc-rockchip-i2s.ko of_prop_next_string @@ -2682,8 +2705,10 @@ snd_soc_jack_add_zones snd_soc_jack_get_type +# required by snd-soc-rockchip-spdif.ko + snd_pcm_create_iec958_consumer_hw_params + # required by snd-soc-rt5640.ko - gpiod_set_raw_value_cansleep regmap_register_patch snd_soc_dapm_force_bias_level @@ -2780,15 +2805,6 @@ dma_fence_free dma_fence_signal_locked __get_task_comm - put_unused_fd - -# required by system_heap.ko - deferred_free - dmabuf_page_pool_alloc - dmabuf_page_pool_create - dmabuf_page_pool_destroy - dmabuf_page_pool_free - swiotlb_max_segment # required by tcpci_husb311.ko tcpci_get_tcpm_port @@ -2812,6 +2828,7 @@ # required by timer-rockchip.ko clockevents_config_and_register + irq_of_parse_and_map # required by tps65132-regulator.ko regulator_set_active_discharge_regmap @@ -2892,6 +2909,7 @@ # required by video_rkisp.ko media_device_cleanup + __memcpy_fromio __memcpy_toio param_ops_ullong v4l2_pipeline_link_notify @@ -2931,7 +2949,6 @@ # required by zsmalloc.ko alloc_anon_inode - __ClearPageMovable contig_page_data dec_zone_page_state inc_zone_page_state @@ -2940,11 +2957,8 @@ kern_mount kern_unmount kill_anon_super - __lock_page page_mapping _raw_read_lock _raw_read_unlock _raw_write_lock _raw_write_unlock - __SetPageMovable - unlock_page diff --git a/arch/arm64/configs/rockchip_gki.fragment b/arch/arm64/configs/rockchip_gki.fragment index 6253108101a8..1eb35920205b 100644 --- a/arch/arm64/configs/rockchip_gki.fragment +++ b/arch/arm64/configs/rockchip_gki.fragment @@ -50,10 +50,16 @@ CONFIG_DRM_PANEL_SIMPLE=m CONFIG_DRM_RK1000_TVE=m CONFIG_DRM_RK630_TVE=m CONFIG_DRM_ROCKCHIP=m +CONFIG_DRM_ROCKCHIP_RK618=m CONFIG_DRM_ROCKCHIP_RK628=m CONFIG_DRM_ROHM_BU18XL82=m CONFIG_DRM_SII902X=m CONFIG_DTC_SYMBOLS=y +# CONFIG_DWMAC_GENERIC is not set +# CONFIG_DWMAC_IPQ806X is not set +# CONFIG_DWMAC_QCOM_ETHQOS is not set +# CONFIG_DWMAC_SUN8I is not set +# CONFIG_DWMAC_SUNXI is not set CONFIG_DW_WATCHDOG=m CONFIG_GPIO_ROCKCHIP=m CONFIG_GREENASIA_FF=y @@ -146,6 +152,7 @@ CONFIG_MALI_BIFROST_EXPERT=y CONFIG_MALI_CSF_SUPPORT=y CONFIG_MALI_PLATFORM_NAME="rk" CONFIG_MALI_PWRSOFT_765=y +CONFIG_MFD_RK618=m CONFIG_MFD_RK628=m CONFIG_MFD_RK630_I2C=m CONFIG_MFD_RK806_SPI=m @@ -186,6 +193,7 @@ CONFIG_PROXIMITY_DEVICE=m CONFIG_PS_STK3410=m CONFIG_PS_UCS14620=m CONFIG_PWM_ROCKCHIP=m +CONFIG_PWRSEQ_SIMPLE=m CONFIG_REGULATOR_ACT8865=m CONFIG_REGULATOR_FAN53555=m CONFIG_REGULATOR_GPIO=m @@ -236,6 +244,7 @@ CONFIG_ROCKCHIP_PM_DOMAINS=m CONFIG_ROCKCHIP_PVTM=m CONFIG_ROCKCHIP_REMOTECTL=m CONFIG_ROCKCHIP_REMOTECTL_PWM=m +CONFIG_ROCKCHIP_MULTI_RGA=m CONFIG_ROCKCHIP_RGB=y CONFIG_ROCKCHIP_RKNPU=m CONFIG_ROCKCHIP_SARADC=m @@ -278,6 +287,7 @@ CONFIG_SND_SOC_RT5640=m CONFIG_SND_SOC_SPDIF=m CONFIG_SPI_ROCKCHIP=m CONFIG_SPI_SPIDEV=m +CONFIG_STMMAC_ETH=m CONFIG_SW_SYNC=m CONFIG_SYSCON_REBOOT_MODE=m CONFIG_TEE=m @@ -328,8 +338,8 @@ CONFIG_VIDEO_RK628_BT1120=m CONFIG_VIDEO_RK628_CSI=m CONFIG_VIDEO_RK_IRCUT=m 
CONFIG_VIDEO_ROCKCHIP_CIF=m +CONFIG_VIDEO_ROCKCHIP_HDMIRX=m CONFIG_VIDEO_ROCKCHIP_ISP=m -CONFIG_VIDEO_ROCKCHIP_ISPP=m CONFIG_VIDEO_S5K3L6XX=m CONFIG_VIDEO_S5KJN1=m CONFIG_VIDEO_SGM3784=m From 91e4675508a2633a8c9d178912f40fd9b0adb299 Mon Sep 17 00:00:00 2001 From: Eric Biggers Date: Thu, 5 Jan 2023 22:29:52 +0000 Subject: [PATCH 31/51] ANDROID: fips140: add crypto_memneq() back to the module crypto_memneq() is one of the utility functions that was intentionally included in the fips140 module, out of concerns that it would be seen as "cryptographic" and thus would be required to be included the module for the FIPS certification. It should not have been removed from the module, so add it back. Bug: 188620248 Fixes: 18cd39b70602 ("Merge tag 'android12-5.10.136_r00' into android12-5.10") Change-Id: I8a19dfd73390f8c1348885f97fa42d900e47b82b Signed-off-by: Eric Biggers --- crypto/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/crypto/Makefile b/crypto/Makefile index 2785e5fab9e7..8be22a6ea3d8 100644 --- a/crypto/Makefile +++ b/crypto/Makefile @@ -214,7 +214,7 @@ crypto-fips-objs := drbg.o ecb.o cbc.o ctr.o cts.o gcm.o xts.o hmac.o cmac.o \ gf128mul.o aes_generic.o lib-crypto-aes.o \ jitterentropy.o jitterentropy-kcapi.o \ sha1_generic.o sha256_generic.o sha512_generic.o \ - lib-sha1.o lib-crypto-sha256.o + lib-memneq.o lib-sha1.o lib-crypto-sha256.o crypto-fips-objs := $(foreach o,$(crypto-fips-objs),$(o:.o=-fips.o)) # get the arch to add its objects to $(crypto-fips-objs) From 447ba7ae757cd8ff9900875586396d64030933c1 Mon Sep 17 00:00:00 2001 From: Wu Bo Date: Mon, 9 Jan 2023 16:15:31 +0800 Subject: [PATCH 32/51] ANDROID: GKI: VIVO: Add a symbol to symbol list Add 'dentry_path_raw' symbol to support some monitoring tools. This patch does not add or remove symbol from xml file. Bug: 264831214 Change-Id: I2b5aaa2945c5fd0ebe4062915b53407251a6ab77 Signed-off-by: Wu Bo --- android/abi_gki_aarch64_vivo | 1 + 1 file changed, 1 insertion(+) diff --git a/android/abi_gki_aarch64_vivo b/android/abi_gki_aarch64_vivo index c3bc87e17c9f..28c4b05aa43f 100644 --- a/android/abi_gki_aarch64_vivo +++ b/android/abi_gki_aarch64_vivo @@ -277,6 +277,7 @@ del_gendisk del_timer del_timer_sync + dentry_path_raw desc_to_gpio destroy_workqueue dev_coredumpv From 824c55581dde8710103b10f8968050f2cc9bf3fb Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 5 Sep 2021 11:24:05 -0700 Subject: [PATCH 33/51] UPSTREAM: Enable '-Werror' by default for all kernel builds ... but make it a config option so that broken environments can disable it when required. We really should always have a clean build, and will disable specific over-eager warnings as required, if we can't fix them. But while I fairly religiously enforce that in my own tree, it doesn't get enforced by various build robots that don't necessarily report warnings. So this just makes '-Werror' a default compiler flag, but allows people to disable it for their configuration if they have some particular issues. Occasionally, new compiler versions end up enabling new warnings, and it can take a while before we have them fixed (or the warnings disabled if that is what it takes), so the config option allows for that situation. Hopefully this will mean that I get fewer pull requests that have new warnings that were not noticed by various automation we have in place. Knock wood. 
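For illustration only (not part of this change): a tree or build environment
that cannot yet build warning-free can simply switch the new option off in its
configuration, either directly in the generated .config:

	# CONFIG_WERROR is not set

or by passing '-d WERROR' to scripts/config before running olddefconfig, which
is how the Android allmodconfig build opts out in a following patch in this
series.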
Signed-off-by: Linus Torvalds (cherry picked from commit 3fe617ccafd6f5bb33c2391d6f4eeb41c1fd0151) Signed-off-by: Alistair Delva Change-Id: If981f26ebe668be7c727661fede10215c4ee5bc5 --- Makefile | 3 +++ init/Kconfig | 14 ++++++++++++++ 2 files changed, 17 insertions(+) diff --git a/Makefile b/Makefile index a413e6c0c15c..e8b8d5894f9e 100644 --- a/Makefile +++ b/Makefile @@ -793,6 +793,9 @@ stackp-flags-$(CONFIG_STACKPROTECTOR_STRONG) := -fstack-protector-strong KBUILD_CFLAGS += $(stackp-flags-y) +KBUILD_CFLAGS-$(CONFIG_WERROR) += -Werror +KBUILD_CFLAGS += $(KBUILD_CFLAGS-y) + ifdef CONFIG_CC_IS_CLANG KBUILD_CPPFLAGS += -Qunused-arguments KBUILD_CFLAGS += -Wno-format-invalid-specifier diff --git a/init/Kconfig b/init/Kconfig index 6c119a4010ee..f7b680e39f65 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -131,6 +131,20 @@ config COMPILE_TEST here. If you are a user/distributor, say N here to exclude useless drivers to be distributed. +config WERROR + bool "Compile the kernel with warnings as errors" + default y + help + A kernel build should not cause any compiler warnings, and this + enables the '-Werror' flag to enforce that rule by default. + + However, if you have a new (or very old) compiler with odd and + unusual warnings, or you have some architecture with problems, + you may need to disable this config option in order to + successfully build the kernel. + + If in doubt, say Y. + config UAPI_HEADER_TEST bool "Compile test UAPI headers" depends on HEADERS_INSTALL && CC_CAN_LINK From 7c31ae524c45378af224f52d47caaa117a582490 Mon Sep 17 00:00:00 2001 From: Greg Kroah-Hartman Date: Wed, 8 Dec 2021 14:13:22 +0100 Subject: [PATCH 34/51] ANDROID: allmodconfig: disable WERROR -Werror still fails on some arm and arm64 code due to clang issues (works on gcc!), so disable it when building allmodconfig builds for now. Hopefully the clang developers will work on this... Bug: 199872592 Signed-off-by: Greg Kroah-Hartman Change-Id: I6ccc856773c40e3c0f541a1316b20e9ae3de4380 (cherry picked from commit eb57c31115051c5404d1bb1f2daec20e051b0287) Signed-off-by: Alistair Delva --- build.config.allmodconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/build.config.allmodconfig b/build.config.allmodconfig index a48f8d420208..e5cbe7faa60a 100644 --- a/build.config.allmodconfig +++ b/build.config.allmodconfig @@ -9,6 +9,7 @@ function update_config() { -d CPU_BIG_ENDIAN \ -d DYNAMIC_FTRACE \ -e UNWINDER_FRAME_POINTER \ + -d WERROR \ (cd ${OUT_DIR} && \ make O=${OUT_DIR} $archsubarch CROSS_COMPILE=${CROSS_COMPILE} "${TOOL_ARGS[@]}" ${MAKE_ARGS} olddefconfig) From 25280f263d3c5f3d29fd00fc7bb032e99a17cfaa Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Wed, 16 Nov 2022 09:43:39 +0100 Subject: [PATCH 35/51] UPSTREAM: io_uring: kill goto error handling in io_sqpoll_wait_sq() Hunk extracted from commit 70aacfe66136809d7f080f89c492c278298719f4 upstream. If the sqpoll thread has died, the out condition doesn't remove the waiting task from the waitqueue. The goto and check are not needed, just make it a break condition after setting the error value. That ensures that we always remove ourselves from sqo_sq_wait waitqueue. 
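With the goto removed, the wait loop has a single exit path that always runs
finish_wait(). Roughly (a simplified sketch of the resulting control flow, not
the verbatim function; the prepare_to_wait()/schedule() lines are paraphrased
from context):

	do {
		prepare_to_wait(&ctx->sqo_sq_wait, &wait, TASK_INTERRUPTIBLE);

		if (unlikely(ctx->sqo_dead)) {
			ret = -EOWNERDEAD;
			break;		/* still falls through to finish_wait() */
		}

		if (!io_sqring_full(ctx))
			break;
		schedule();
	} while (!signal_pending(current));

	finish_wait(&ctx->sqo_sq_wait, &wait);
	return ret;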
Bug: 259534862 Reported-by: Xingyuan Mo Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman (cherry picked from commit 0f544353fec8e717d37724d95b92538e1de79e86) Signed-off-by: Lee Jones Change-Id: I453c3e23a2f0c5ce6a8dd73dac020ec6f32994ca --- fs/io_uring.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 5538906e47fe..7baab885c6f2 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -9029,7 +9029,7 @@ static int io_sqpoll_wait_sq(struct io_ring_ctx *ctx) if (unlikely(ctx->sqo_dead)) { ret = -EOWNERDEAD; - goto out; + break; } if (!io_sqring_full(ctx)) @@ -9039,7 +9039,6 @@ static int io_sqpoll_wait_sq(struct io_ring_ctx *ctx) } while (!signal_pending(current)); finish_wait(&ctx->sqo_sq_wait, &wait); -out: return ret; } From 3112d6f5021988af4a150be1142ee68419463f38 Mon Sep 17 00:00:00 2001 From: Chao Yu Date: Wed, 4 Aug 2021 10:23:48 +0800 Subject: [PATCH 36/51] BACKPORT: f2fs: extent cache: support unaligned extent Compressed inode may suffer read performance issue due to it can not use extent cache, so I propose to add this unaligned extent support to improve it. Currently, it only works in readonly format f2fs image. Unaligned extent: in one compressed cluster, physical block number will be less than logical block number, so we add an extra physical block length in extent info in order to indicate such extent status. The idea is if one whole cluster blocks are contiguous physically, once its mapping info was readed at first time, we will cache an unaligned (or aligned) extent info entry in extent cache, it expects that the mapping info will be hitted when rereading cluster. Merge policy: - Aligned extents can be merged. - Aligned extent and unaligned extent can not be merged. Bug: 264453689 Signed-off-by: Chao Yu Signed-off-by: Jaegeuk Kim (cherry picked from commit 627371ed31cf9f56ba66ad8dfabbc7bc9b7a4816) Change-Id: I106279145558f38dfa295c3e99fa03f6fcd306f4 --- fs/f2fs/compress.c | 24 ++++++++++++++++++++++++ fs/f2fs/data.c | 38 +++++++++++++++++++++++++++----------- fs/f2fs/extent_cache.c | 41 +++++++++++++++++++++++++++++++++++++++++ fs/f2fs/f2fs.h | 20 ++++++++++++++++++++ fs/f2fs/node.c | 20 ++++++++++++++++++++ 5 files changed, 132 insertions(+), 11 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 12c789f879a1..3116b38c2f6b 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1660,6 +1660,30 @@ void f2fs_put_page_dic(struct page *page) f2fs_put_dic(dic); } +/* + * check whether cluster blocks are contiguous, and add extent cache entry + * only if cluster blocks are logically and physically contiguous. + */ +unsigned int f2fs_cluster_blocks_are_contiguous(struct dnode_of_data *dn) +{ + bool compressed = f2fs_data_blkaddr(dn) == COMPRESS_ADDR; + int i = compressed ? 1 : 0; + block_t first_blkaddr = data_blkaddr(dn->inode, dn->node_page, + dn->ofs_in_node + i); + + for (i += 1; i < F2FS_I(dn->inode)->i_cluster_size; i++) { + block_t blkaddr = data_blkaddr(dn->inode, dn->node_page, + dn->ofs_in_node + i); + + if (!__is_valid_data_blkaddr(blkaddr)) + break; + if (first_blkaddr + i - (compressed ? 1 : 0) != blkaddr) + return 0; + } + + return compressed ? 
i - 1 : i; +} + const struct address_space_operations f2fs_compress_aops = { .releasepage = f2fs_release_page, .invalidatepage = f2fs_invalidate_page, diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 867b2b72ec67..279c2bf4d244 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1151,7 +1151,7 @@ int f2fs_reserve_block(struct dnode_of_data *dn, pgoff_t index) int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index) { - struct extent_info ei = {0, 0, 0}; + struct extent_info ei = {0, }; struct inode *inode = dn->inode; if (f2fs_lookup_extent_cache(inode, index, &ei)) { @@ -1168,7 +1168,7 @@ struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index, struct address_space *mapping = inode->i_mapping; struct dnode_of_data dn; struct page *page; - struct extent_info ei = {0,0,0}; + struct extent_info ei = {0, }; int err; page = f2fs_grab_cache_page(mapping, index, for_write); @@ -1466,7 +1466,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, int err = 0, ofs = 1; unsigned int ofs_in_node, last_ofs_in_node; blkcnt_t prealloc; - struct extent_info ei = {0,0,0}; + struct extent_info ei = {0, }; block_t blkaddr; unsigned int start_pgofs; @@ -2156,6 +2156,8 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, sector_t last_block_in_file; const unsigned blocksize = blks_to_bytes(inode, 1); struct decompress_io_ctx *dic = NULL; + struct extent_info ei = {0, }; + bool from_dnode = true; int i; int ret = 0; @@ -2188,6 +2190,12 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, if (f2fs_cluster_is_empty(cc)) goto out; + if (f2fs_lookup_extent_cache(inode, start_idx, &ei)) + from_dnode = false; + + if (!from_dnode) + goto skip_reading_dnode; + set_new_dnode(&dn, inode, NULL, NULL, 0); ret = f2fs_get_dnode_of_data(&dn, start_idx, LOOKUP_NODE); if (ret) @@ -2195,11 +2203,13 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, f2fs_bug_on(sbi, dn.data_blkaddr != COMPRESS_ADDR); +skip_reading_dnode: for (i = 1; i < cc->cluster_size; i++) { block_t blkaddr; - blkaddr = data_blkaddr(dn.inode, dn.node_page, - dn.ofs_in_node + i); + blkaddr = from_dnode ? data_blkaddr(dn.inode, dn.node_page, + dn.ofs_in_node + i) : + ei.blk + i - 1; if (!__is_valid_data_blkaddr(blkaddr)) break; @@ -2209,6 +2219,9 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, goto out_put_dnode; } cc->nr_cpages++; + + if (!from_dnode && i >= ei.c_len) + break; } /* nothing to decompress */ @@ -2228,8 +2241,9 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, block_t blkaddr; struct bio_post_read_ctx *ctx; - blkaddr = data_blkaddr(dn.inode, dn.node_page, - dn.ofs_in_node + i + 1); + blkaddr = from_dnode ? 
data_blkaddr(dn.inode, dn.node_page, + dn.ofs_in_node + i + 1) : + ei.blk + i; f2fs_wait_on_block_writeback(inode, blkaddr); @@ -2274,13 +2288,15 @@ submit_and_realloc: *last_block_in_bio = blkaddr; } - f2fs_put_dnode(&dn); + if (from_dnode) + f2fs_put_dnode(&dn); *bio_ret = bio; return 0; out_put_dnode: - f2fs_put_dnode(&dn); + if (from_dnode) + f2fs_put_dnode(&dn); out: for (i = 0; i < cc->cluster_size; i++) { if (cc->rpages[i]) { @@ -2584,7 +2600,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio) struct page *page = fio->page; struct inode *inode = page->mapping->host; struct dnode_of_data dn; - struct extent_info ei = {0,0,0}; + struct extent_info ei = {0, }; struct node_info ni; bool ipu_force = false; int err = 0; @@ -3265,7 +3281,7 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi, struct dnode_of_data dn; struct page *ipage; bool locked = false; - struct extent_info ei = {0,0,0}; + struct extent_info ei = {0, }; int err = 0; int flag; diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index 3ebf976a682d..b120589d8517 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -661,6 +661,47 @@ static void f2fs_update_extent_tree_range(struct inode *inode, f2fs_mark_inode_dirty_sync(inode, true); } +#ifdef CONFIG_F2FS_FS_COMPRESSION +void f2fs_update_extent_tree_range_compressed(struct inode *inode, + pgoff_t fofs, block_t blkaddr, unsigned int llen, + unsigned int c_len) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + struct extent_tree *et = F2FS_I(inode)->extent_tree; + struct extent_node *en = NULL; + struct extent_node *prev_en = NULL, *next_en = NULL; + struct extent_info ei; + struct rb_node **insert_p = NULL, *insert_parent = NULL; + bool leftmost = false; + + trace_f2fs_update_extent_tree_range(inode, fofs, blkaddr, llen); + + /* it is safe here to check FI_NO_EXTENT w/o et->lock in ro image */ + if (is_inode_flag_set(inode, FI_NO_EXTENT)) + return; + + write_lock(&et->lock); + + en = (struct extent_node *)f2fs_lookup_rb_tree_ret(&et->root, + (struct rb_entry *)et->cached_en, fofs, + (struct rb_entry **)&prev_en, + (struct rb_entry **)&next_en, + &insert_p, &insert_parent, false, + &leftmost); + if (en) + goto unlock_out; + + set_extent_info(&ei, fofs, blkaddr, llen); + ei.c_len = c_len; + + if (!__try_merge_extent_node(sbi, et, &ei, prev_en, next_en)) + __insert_extent_tree(sbi, et, &ei, + insert_p, insert_parent, leftmost); +unlock_out: + write_unlock(&et->lock); +} +#endif + unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink) { struct extent_tree *et, *next; diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index d1af6cc71581..441c05968aa2 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -585,6 +585,9 @@ struct extent_info { unsigned int fofs; /* start offset in a file */ unsigned int len; /* length of the extent */ u32 blk; /* start block address of the extent */ +#ifdef CONFIG_F2FS_FS_COMPRESSION + unsigned int c_len; /* physical extent length of compressed blocks */ +#endif }; struct extent_node { @@ -805,6 +808,9 @@ static inline void set_extent_info(struct extent_info *ei, unsigned int fofs, ei->fofs = fofs; ei->blk = blk; ei->len = len; +#ifdef CONFIG_F2FS_FS_COMPRESSION + ei->c_len = 0; +#endif } static inline bool __is_discard_mergeable(struct discard_info *back, @@ -829,6 +835,12 @@ static inline bool __is_discard_front_mergeable(struct discard_info *cur, static inline bool __is_extent_mergeable(struct extent_info *back, struct extent_info *front) { +#ifdef CONFIG_F2FS_FS_COMPRESSION + if 
(back->c_len && back->len != back->c_len) + return false; + if (front->c_len && front->len != front->c_len) + return false; +#endif return (back->fofs + back->len == front->fofs && back->blk + back->len == front->blk); } @@ -4146,12 +4158,16 @@ int f2fs_write_multi_pages(struct compress_ctx *cc, struct writeback_control *wbc, enum iostat_type io_type); int f2fs_is_compressed_cluster(struct inode *inode, pgoff_t index); +void f2fs_update_extent_tree_range_compressed(struct inode *inode, + pgoff_t fofs, block_t blkaddr, unsigned int llen, + unsigned int c_len); int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, unsigned nr_pages, sector_t *last_block_in_bio, bool is_readahead, bool for_write); struct decompress_io_ctx *f2fs_alloc_dic(struct compress_ctx *cc); void f2fs_decompress_end_io(struct decompress_io_ctx *dic, bool failed); void f2fs_put_page_dic(struct page *page); +unsigned int f2fs_cluster_blocks_are_contiguous(struct dnode_of_data *dn); int f2fs_init_compress_ctx(struct compress_ctx *cc); void f2fs_destroy_compress_ctx(struct compress_ctx *cc, bool reuse); void f2fs_init_compress_info(struct f2fs_sb_info *sbi); @@ -4206,6 +4222,7 @@ static inline void f2fs_put_page_dic(struct page *page) { WARN_ON_ONCE(1); } +static inline unsigned int f2fs_cluster_blocks_are_contiguous(struct dnode_of_data *dn) { return 0; } static inline int f2fs_init_compress_inode(struct f2fs_sb_info *sbi) { return 0; } static inline void f2fs_destroy_compress_inode(struct f2fs_sb_info *sbi) { } static inline int f2fs_init_page_array_cache(struct f2fs_sb_info *sbi) { return 0; } @@ -4221,6 +4238,9 @@ static inline bool f2fs_load_compressed_page(struct f2fs_sb_info *sbi, static inline void f2fs_invalidate_compress_pages(struct f2fs_sb_info *sbi, nid_t ino) { } #define inc_compr_inode_stat(inode) do { } while (0) +static inline void f2fs_update_extent_tree_range_compressed(struct inode *inode, + pgoff_t fofs, block_t blkaddr, unsigned int llen, + unsigned int c_len) { } #endif static inline int set_compress_context(struct inode *inode) diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index ad96f7060c4d..2c71bcd2eee0 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -846,6 +846,26 @@ int f2fs_get_dnode_of_data(struct dnode_of_data *dn, pgoff_t index, int mode) dn->ofs_in_node = offset[level]; dn->node_page = npage[level]; dn->data_blkaddr = f2fs_data_blkaddr(dn); + + if (is_inode_flag_set(dn->inode, FI_COMPRESSED_FILE) && + f2fs_sb_has_readonly(sbi)) { + unsigned int c_len = f2fs_cluster_blocks_are_contiguous(dn); + block_t blkaddr; + + if (!c_len) + goto out; + + blkaddr = f2fs_data_blkaddr(dn); + if (blkaddr == COMPRESS_ADDR) + blkaddr = data_blkaddr(dn->inode, dn->node_page, + dn->ofs_in_node + 1); + + f2fs_update_extent_tree_range_compressed(dn->inode, + index, blkaddr, + F2FS_I(dn->inode)->i_cluster_size, + c_len); + } +out: return 0; release_pages: From f6b4d18df0db11ddb18515e736ae9d4961c8a00c Mon Sep 17 00:00:00 2001 From: Zhang Qilong Date: Mon, 5 Sep 2022 12:59:17 +0800 Subject: [PATCH 37/51] BACKPORT: f2fs: fix race condition on setting FI_NO_EXTENT flag The following scenarios exist. process A: process B: ->f2fs_drop_extent_tree ->f2fs_update_extent_cache_range ->f2fs_update_extent_tree_range ->write_lock ->set_inode_flag ->is_inode_flag_set ->__free_extent_tree // Shouldn't // have been // cleaned up // here ->write_lock In this case, the "FI_NO_EXTENT" flag is set between f2fs_update_extent_tree_range and is_inode_flag_set by other process. 
it leads to clearing the whole exten tree which should not have happened. And we fix it by move the setting it to the range of write_lock. Bug: 264453689 Fixes: 5f281fab9b9a3 ("f2fs: disable extent_cache for fcollapse/finsert inodes") Signed-off-by: Zhang Qilong Reviewed-by: Chao Yu Signed-off-by: Jaegeuk Kim (cherry picked from commit a597af3b5a967e810cc8155abaa49abf10d6c417) Change-Id: If36c3556c9062d46509f704f0491a7e6ad652ae6 --- fs/f2fs/extent_cache.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index b120589d8517..3a0d9401cb9b 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -803,9 +803,8 @@ void f2fs_drop_extent_tree(struct inode *inode) if (!f2fs_may_extent_tree(inode)) return; - set_inode_flag(inode, FI_NO_EXTENT); - write_lock(&et->lock); + set_inode_flag(inode, FI_NO_EXTENT); __free_extent_tree(sbi, et); if (et->largest.len) { et->largest.len = 0; From 02cb04cb05a8f694346cd0c7988d6d432f7bc811 Mon Sep 17 00:00:00 2001 From: Zhang Qilong Date: Mon, 19 Sep 2022 19:57:09 +0800 Subject: [PATCH 38/51] BACKPORT: f2fs: add "c_len" into trace_f2fs_update_extent_tree_range for compressed file The trace_f2fs_update_extent_tree_range could not record compressed block length in the cluster of compress file and we just add it. Bug: 264453689 Signed-off-by: Zhang Qilong Reviewed-by: Chao Yu Signed-off-by: Jaegeuk Kim (cherry picked from commit a95694e33ce998e808b3a72164e8df3b5e0faf87) Change-Id: Ic3702b2735be27cf1eab34191313b162805a010a --- fs/f2fs/extent_cache.c | 4 ++-- include/trace/events/f2fs.h | 13 +++++++++---- 2 files changed, 11 insertions(+), 6 deletions(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index 3a0d9401cb9b..d837750f14f6 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -543,7 +543,7 @@ static void f2fs_update_extent_tree_range(struct inode *inode, if (!et) return; - trace_f2fs_update_extent_tree_range(inode, fofs, blkaddr, len); + trace_f2fs_update_extent_tree_range(inode, fofs, blkaddr, len, 0); write_lock(&et->lock); @@ -674,7 +674,7 @@ void f2fs_update_extent_tree_range_compressed(struct inode *inode, struct rb_node **insert_p = NULL, *insert_parent = NULL; bool leftmost = false; - trace_f2fs_update_extent_tree_range(inode, fofs, blkaddr, llen); + trace_f2fs_update_extent_tree_range(inode, fofs, blkaddr, llen, c_len); /* it is safe here to check FI_NO_EXTENT w/o et->lock in ro image */ if (is_inode_flag_set(inode, FI_NO_EXTENT)) diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index df293bc7f03b..02c1ccc4f925 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -1586,9 +1586,10 @@ TRACE_EVENT_CONDITION(f2fs_lookup_extent_tree_end, TRACE_EVENT(f2fs_update_extent_tree_range, TP_PROTO(struct inode *inode, unsigned int pgofs, block_t blkaddr, - unsigned int len), + unsigned int len, + unsigned int c_len), - TP_ARGS(inode, pgofs, blkaddr, len), + TP_ARGS(inode, pgofs, blkaddr, len, c_len), TP_STRUCT__entry( __field(dev_t, dev) @@ -1596,6 +1597,7 @@ TRACE_EVENT(f2fs_update_extent_tree_range, __field(unsigned int, pgofs) __field(u32, blk) __field(unsigned int, len) + __field(unsigned int, c_len) ), TP_fast_assign( @@ -1604,14 +1606,17 @@ TRACE_EVENT(f2fs_update_extent_tree_range, __entry->pgofs = pgofs; __entry->blk = blkaddr; __entry->len = len; + __entry->c_len = c_len; ), TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, " - "blkaddr = %u, len = %u", + "blkaddr = %u, len = %u, " + "c_len = %u", 
show_dev_ino(__entry), __entry->pgofs, __entry->blk, - __entry->len) + __entry->len, + __entry->c_len) ); TRACE_EVENT(f2fs_shrink_extent_tree, From b29b3bd7e10d890e56e3e4a4fc31a575b9beb8db Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Wed, 30 Nov 2022 09:36:43 -0800 Subject: [PATCH 39/51] BACKPORT: f2fs: specify extent cache for read explicitly Let's descrbie it's read extent cache. Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit 29539ed310a92b5eb1e6ce8906087536b6259d58) Change-Id: I9cb258b7ee152ead1458ef8adb3e542532f152ce --- fs/f2fs/extent_cache.c | 4 ++-- fs/f2fs/f2fs.h | 10 +++++----- fs/f2fs/inode.c | 2 +- fs/f2fs/node.c | 2 +- fs/f2fs/node.h | 2 +- fs/f2fs/segment.c | 4 ++-- fs/f2fs/super.c | 12 ++++++------ 7 files changed, 18 insertions(+), 18 deletions(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index d837750f14f6..3129347a0f66 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -382,7 +382,7 @@ static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage) if (!i_ext || !i_ext->len) return; - get_extent_info(&ei, i_ext); + get_read_extent_info(&ei, i_ext); write_lock(&et->lock); if (atomic_read(&et->node_cnt)) @@ -709,7 +709,7 @@ unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink) unsigned int node_cnt = 0, tree_cnt = 0; int remained; - if (!test_opt(sbi, EXTENT_CACHE)) + if (!test_opt(sbi, READ_EXTENT_CACHE)) return 0; if (!atomic_read(&sbi->total_zombie_tree)) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 441c05968aa2..09ccdb0d599f 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -84,7 +84,7 @@ extern const char *f2fs_fault_name[FAULT_MAX]; #define F2FS_MOUNT_FLUSH_MERGE 0x00000400 #define F2FS_MOUNT_NOBARRIER 0x00000800 #define F2FS_MOUNT_FASTBOOT 0x00001000 -#define F2FS_MOUNT_EXTENT_CACHE 0x00002000 +#define F2FS_MOUNT_READ_EXTENT_CACHE 0x00002000 #define F2FS_MOUNT_DATA_FLUSH 0x00008000 #define F2FS_MOUNT_FAULT_INJECTION 0x00010000 #define F2FS_MOUNT_USRQUOTA 0x00080000 @@ -568,7 +568,7 @@ enum { #define F2FS_MIN_EXTENT_LEN 64 /* minimum extent length */ /* number of extent info in extent cache we try to shrink */ -#define EXTENT_CACHE_SHRINK_NUMBER 128 +#define READ_EXTENT_CACHE_SHRINK_NUMBER 128 struct rb_entry { struct rb_node rb_node; /* rb node located in rb-tree */ @@ -786,7 +786,7 @@ struct f2fs_inode_info { unsigned int i_cluster_size; /* cluster size */ }; -static inline void get_extent_info(struct extent_info *ext, +static inline void get_read_extent_info(struct extent_info *ext, struct f2fs_extent *i_ext) { ext->fofs = le32_to_cpu(i_ext->fofs); @@ -794,7 +794,7 @@ static inline void get_extent_info(struct extent_info *ext, ext->len = le32_to_cpu(i_ext->len); } -static inline void set_raw_extent(struct extent_info *ext, +static inline void set_raw_read_extent(struct extent_info *ext, struct f2fs_extent *i_ext) { i_ext->fofs = cpu_to_le32(ext->fofs); @@ -4314,7 +4314,7 @@ static inline bool f2fs_may_extent_tree(struct inode *inode) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - if (!test_opt(sbi, EXTENT_CACHE) || + if (!test_opt(sbi, READ_EXTENT_CACHE) || is_inode_flag_set(inode, FI_NO_EXTENT) || (is_inode_flag_set(inode, FI_COMPRESSED_FILE) && !f2fs_sb_has_readonly(sbi))) diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index 29bf3e215f52..0b54b678137f 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -590,7 +590,7 @@ void f2fs_update_inode(struct inode *inode, struct page *node_page) if (et) { read_lock(&et->lock); - set_raw_extent(&et->largest, 
&ri->i_ext); + set_raw_read_extent(&et->largest, &ri->i_ext); read_unlock(&et->lock); } else { memset(&ri->i_ext, 0, sizeof(ri->i_ext)); diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 2c71bcd2eee0..a55b3676233c 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -83,7 +83,7 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type) sizeof(struct ino_entry); mem_size >>= PAGE_SHIFT; res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1); - } else if (type == EXTENT_CACHE) { + } else if (type == READ_EXTENT_CACHE) { mem_size = (atomic_read(&sbi->total_ext_tree) * sizeof(struct extent_tree) + atomic_read(&sbi->total_ext_node) * diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h index ff14a6e5ac1c..2c152e677a06 100644 --- a/fs/f2fs/node.h +++ b/fs/f2fs/node.h @@ -148,7 +148,7 @@ enum mem_type { NAT_ENTRIES, /* indicates the cached nat entry */ DIRTY_DENTS, /* indicates dirty dentry pages */ INO_ENTRIES, /* indicates inode entries */ - EXTENT_CACHE, /* indicates extent cache */ + READ_EXTENT_CACHE, /* indicates read extent cache */ INMEM_PAGES, /* indicates inmemory pages */ DISCARD_CACHE, /* indicates memory of cached discard cmds */ COMPRESS_PAGE, /* indicates memory of cached compressed pages */ diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index fdb41cb3fd68..ef599fbab740 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -536,8 +536,8 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi, bool from_bg) return; /* try to shrink extent cache when there is no enough memory */ - if (!f2fs_available_free_memory(sbi, EXTENT_CACHE)) - f2fs_shrink_extent_tree(sbi, EXTENT_CACHE_SHRINK_NUMBER); + if (!f2fs_available_free_memory(sbi, READ_EXTENT_CACHE)) + f2fs_shrink_extent_tree(sbi, READ_EXTENT_CACHE_SHRINK_NUMBER); /* check the # of cached NAT entries */ if (!f2fs_available_free_memory(sbi, NAT_ENTRIES)) diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 89d90d366754..98da6a8636e0 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -753,10 +753,10 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount) set_opt(sbi, FASTBOOT); break; case Opt_extent_cache: - set_opt(sbi, EXTENT_CACHE); + set_opt(sbi, READ_EXTENT_CACHE); break; case Opt_noextent_cache: - clear_opt(sbi, EXTENT_CACHE); + clear_opt(sbi, READ_EXTENT_CACHE); break; case Opt_noinline_data: clear_opt(sbi, INLINE_DATA); @@ -1817,7 +1817,7 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root) seq_puts(seq, ",nobarrier"); if (test_opt(sbi, FASTBOOT)) seq_puts(seq, ",fastboot"); - if (test_opt(sbi, EXTENT_CACHE)) + if (test_opt(sbi, READ_EXTENT_CACHE)) seq_puts(seq, ",extent_cache"); else seq_puts(seq, ",noextent_cache"); @@ -1922,7 +1922,7 @@ static void default_options(struct f2fs_sb_info *sbi) set_opt(sbi, INLINE_XATTR); set_opt(sbi, INLINE_DATA); set_opt(sbi, INLINE_DENTRY); - set_opt(sbi, EXTENT_CACHE); + set_opt(sbi, READ_EXTENT_CACHE); set_opt(sbi, NOHEAP); clear_opt(sbi, DISABLE_CHECKPOINT); set_opt(sbi, MERGE_CHECKPOINT); @@ -2042,7 +2042,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data) bool need_restart_gc = false, need_stop_gc = false; bool need_restart_ckpt = false, need_stop_ckpt = false; bool need_restart_flush = false, need_stop_flush = false; - bool no_extent_cache = !test_opt(sbi, EXTENT_CACHE); + bool no_read_extent_cache = !test_opt(sbi, READ_EXTENT_CACHE); bool disable_checkpoint = test_opt(sbi, DISABLE_CHECKPOINT); bool no_io_align = !F2FS_IO_ALIGNED(sbi); bool no_atgc = !test_opt(sbi, ATGC); @@ -2132,7 +2132,7 @@ 
static int f2fs_remount(struct super_block *sb, int *flags, char *data) } /* disallow enable/disable extent_cache dynamically */ - if (no_extent_cache == !!test_opt(sbi, EXTENT_CACHE)) { + if (no_read_extent_cache == !!test_opt(sbi, READ_EXTENT_CACHE)) { err = -EINVAL; f2fs_warn(sbi, "switch extent_cache option is not allowed"); goto restore_opts; From bf3cafe7f197b599848f075c3976b3ce195fa9a1 Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Wed, 30 Nov 2022 09:44:58 -0800 Subject: [PATCH 40/51] BACKPORT: f2fs: move internal functions into extent_cache.c No functional change. Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit f458f39556cec5660a5508fbbeea4917bc46bc9c) Change-Id: I6171ab293b214a2eeb3f420517216371ee29557e --- fs/f2fs/extent_cache.c | 88 +++++++++++++++++++++++++++++++++++++----- fs/f2fs/f2fs.h | 69 +-------------------------------- 2 files changed, 81 insertions(+), 76 deletions(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index 3129347a0f66..55319f1ed9d1 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -15,6 +15,77 @@ #include "node.h" #include +static void __set_extent_info(struct extent_info *ei, + unsigned int fofs, unsigned int len, + block_t blk, bool keep_clen) +{ + ei->fofs = fofs; + ei->blk = blk; + ei->len = len; + + if (keep_clen) + return; + +#ifdef CONFIG_F2FS_FS_COMPRESSION + ei->c_len = 0; +#endif +} + +static bool f2fs_may_extent_tree(struct inode *inode) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + + /* + * for recovered files during mount do not create extents + * if shrinker is not registered. + */ + if (list_empty(&sbi->s_list)) + return false; + + if (!test_opt(sbi, READ_EXTENT_CACHE) || + is_inode_flag_set(inode, FI_NO_EXTENT) || + (is_inode_flag_set(inode, FI_COMPRESSED_FILE) && + !f2fs_sb_has_readonly(sbi))) + return false; + + return S_ISREG(inode->i_mode); +} + +static void __try_update_largest_extent(struct extent_tree *et, + struct extent_node *en) +{ + if (en->ei.len <= et->largest.len) + return; + + et->largest = en->ei; + et->largest_updated = true; +} + +static bool __is_extent_mergeable(struct extent_info *back, + struct extent_info *front) +{ +#ifdef CONFIG_F2FS_FS_COMPRESSION + if (back->c_len && back->len != back->c_len) + return false; + if (front->c_len && front->len != front->c_len) + return false; +#endif + return (back->fofs + back->len == front->fofs && + back->blk + back->len == front->blk); +} + +static bool __is_back_mergeable(struct extent_info *cur, + struct extent_info *back) +{ + return __is_extent_mergeable(back, cur); +} + +static bool __is_front_mergeable(struct extent_info *cur, + struct extent_info *front) +{ + return __is_extent_mergeable(cur, front); +} + static struct rb_entry *__lookup_rb_tree_fast(struct rb_entry *cached_re, unsigned int ofs) { @@ -590,16 +661,16 @@ static void f2fs_update_extent_tree_range(struct inode *inode, if (end < org_end && org_end - end >= F2FS_MIN_EXTENT_LEN) { if (parts) { - set_extent_info(&ei, end, - end - dei.fofs + dei.blk, - org_end - end); + __set_extent_info(&ei, + end, org_end - end, + end - dei.fofs + dei.blk, false); en1 = __insert_extent_tree(sbi, et, &ei, NULL, NULL, true); next_en = en1; } else { - en->ei.fofs = end; - en->ei.blk += end - dei.fofs; - en->ei.len -= end - dei.fofs; + __set_extent_info(&en->ei, + end, en->ei.len - (end - dei.fofs), + en->ei.blk + (end - dei.fofs), true); next_en = en; } parts++; @@ -631,8 +702,7 @@ static void f2fs_update_extent_tree_range(struct inode *inode, /* 3. 
update extent in extent cache */ if (blkaddr) { - - set_extent_info(&ei, fofs, blkaddr, len); + __set_extent_info(&ei, fofs, len, blkaddr, false); if (!__try_merge_extent_node(sbi, et, &ei, prev_en, next_en)) __insert_extent_tree(sbi, et, &ei, insert_p, insert_parent, leftmost); @@ -691,7 +761,7 @@ void f2fs_update_extent_tree_range_compressed(struct inode *inode, if (en) goto unlock_out; - set_extent_info(&ei, fofs, blkaddr, llen); + __set_extent_info(&ei, fofs, llen, blkaddr, true); ei.c_len = c_len; if (!__try_merge_extent_node(sbi, et, &ei, prev_en, next_en)) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 09ccdb0d599f..3ecb53f2f67f 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -584,7 +584,7 @@ struct rb_entry { struct extent_info { unsigned int fofs; /* start offset in a file */ unsigned int len; /* length of the extent */ - u32 blk; /* start block address of the extent */ + block_t blk; /* start block address of the extent */ #ifdef CONFIG_F2FS_FS_COMPRESSION unsigned int c_len; /* physical extent length of compressed blocks */ #endif @@ -802,17 +802,6 @@ static inline void set_raw_read_extent(struct extent_info *ext, i_ext->len = cpu_to_le32(ext->len); } -static inline void set_extent_info(struct extent_info *ei, unsigned int fofs, - u32 blk, unsigned int len) -{ - ei->fofs = fofs; - ei->blk = blk; - ei->len = len; -#ifdef CONFIG_F2FS_FS_COMPRESSION - ei->c_len = 0; -#endif -} - static inline bool __is_discard_mergeable(struct discard_info *back, struct discard_info *front, unsigned int max_len) { @@ -832,41 +821,6 @@ static inline bool __is_discard_front_mergeable(struct discard_info *cur, return __is_discard_mergeable(cur, front, max_len); } -static inline bool __is_extent_mergeable(struct extent_info *back, - struct extent_info *front) -{ -#ifdef CONFIG_F2FS_FS_COMPRESSION - if (back->c_len && back->len != back->c_len) - return false; - if (front->c_len && front->len != front->c_len) - return false; -#endif - return (back->fofs + back->len == front->fofs && - back->blk + back->len == front->blk); -} - -static inline bool __is_back_mergeable(struct extent_info *cur, - struct extent_info *back) -{ - return __is_extent_mergeable(back, cur); -} - -static inline bool __is_front_mergeable(struct extent_info *cur, - struct extent_info *front) -{ - return __is_extent_mergeable(cur, front); -} - -extern void f2fs_mark_inode_dirty_sync(struct inode *inode, bool sync); -static inline void __try_update_largest_extent(struct extent_tree *et, - struct extent_node *en) -{ - if (en->ei.len > et->largest.len) { - et->largest = en->ei; - et->largest_updated = true; - } -} - /* * For free nid management */ @@ -2492,6 +2446,7 @@ static inline block_t __start_sum_addr(struct f2fs_sb_info *sbi) return le32_to_cpu(F2FS_CKPT(sbi)->cp_pack_start_sum); } +extern void f2fs_mark_inode_dirty_sync(struct inode *inode, bool sync); static inline int inc_valid_node_count(struct f2fs_sb_info *sbi, struct inode *inode, bool is_inode) { @@ -4310,26 +4265,6 @@ F2FS_FEATURE_FUNCS(casefold, CASEFOLD); F2FS_FEATURE_FUNCS(compression, COMPRESSION); F2FS_FEATURE_FUNCS(readonly, RO); -static inline bool f2fs_may_extent_tree(struct inode *inode) -{ - struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - - if (!test_opt(sbi, READ_EXTENT_CACHE) || - is_inode_flag_set(inode, FI_NO_EXTENT) || - (is_inode_flag_set(inode, FI_COMPRESSED_FILE) && - !f2fs_sb_has_readonly(sbi))) - return false; - - /* - * for recovered files during mount do not create extents - * if shrinker is not registered. 
- */ - if (list_empty(&sbi->s_list)) - return false; - - return S_ISREG(inode->i_mode); -} - #ifdef CONFIG_BLK_DEV_ZONED static inline bool f2fs_blkz_is_seq(struct f2fs_sb_info *sbi, int devi, block_t blkaddr) From 561e9febb3fe985a68bca33c35b3395ef141334b Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Wed, 30 Nov 2022 10:01:18 -0800 Subject: [PATCH 41/51] BACKPORT: f2fs: remove unnecessary __init_extent_tree Added into the caller. Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit 0b461f459f0f53ff29b0ff98d4c18808b4248dcc) Change-Id: Ic52ba22c00055bfe85bb789e6fd057d5fa84c00f --- fs/f2fs/extent_cache.c | 21 +++++---------------- 1 file changed, 5 insertions(+), 16 deletions(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index 55319f1ed9d1..adee87d4b18c 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -385,21 +385,6 @@ static struct extent_tree *__grab_extent_tree(struct inode *inode) return et; } -static struct extent_node *__init_extent_tree(struct f2fs_sb_info *sbi, - struct extent_tree *et, struct extent_info *ei) -{ - struct rb_node **p = &et->root.rb_root.rb_node; - struct extent_node *en; - - en = __attach_extent_node(sbi, et, ei, NULL, p, true); - if (!en) - return NULL; - - et->largest = en->ei; - et->cached_en = en; - return en; -} - static unsigned int __free_extent_tree(struct f2fs_sb_info *sbi, struct extent_tree *et) { @@ -459,8 +444,12 @@ static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage) if (atomic_read(&et->node_cnt)) goto out; - en = __init_extent_tree(sbi, et, &ei); + en = __attach_extent_node(sbi, et, &ei, NULL, + &et->root.rb_root.rb_node, true); if (en) { + et->largest = en->ei; + et->cached_en = en; + spin_lock(&sbi->extent_lock); list_add_tail(&en->list, &sbi->extent_list); spin_unlock(&sbi->extent_lock); From 72e9dd90cf433ca66645732af5e073f37060df34 Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Wed, 30 Nov 2022 09:26:29 -0800 Subject: [PATCH 42/51] BACKPORT: f2fs: refactor extent_cache to support for read and more This patch prepares extent_cache to be ready for addition. 
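Concretely, the per-superblock extent cache state becomes an array indexed by
cache type. A simplified sketch of the shape this refactor introduces (field
and type names are taken from the hunks below; the struct layouts are
abridged, not the full definitions):

	enum extent_type {
		EX_READ,		/* the existing read extent cache */
		NR_EXTENT_CACHES,
	};

	struct extent_tree_info {
		struct radix_tree_root extent_tree_root; /* per-inode extent trees */
		struct mutex extent_tree_lock;
		struct list_head extent_list;	/* LRU of extent nodes */
		spinlock_t extent_lock;		/* protects extent_list */
		atomic_t total_ext_tree;
		atomic_t total_zombie_tree;
		atomic_t total_ext_node;
	};

	/* in struct f2fs_sb_info: one instance per cache type */
	struct extent_tree_info extent_tree[NR_EXTENT_CACHES];

	/* in struct f2fs_inode_info: one tree pointer per cache type */
	struct extent_tree *extent_tree[NR_EXTENT_CACHES];

so that the lookup/update helpers take an enum extent_type and operate on
&sbi->extent_tree[type] and F2FS_I(inode)->extent_tree[type] instead of the
single read-only instance.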
Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit 7cf42d77c7242d23988215297bdd2d215e208b6f) Change-Id: I5ca06c274529187b804ddd4b0834dc44fe6aa8ad --- fs/f2fs/data.c | 18 +- fs/f2fs/debug.c | 65 +++-- fs/f2fs/extent_cache.c | 465 +++++++++++++++++++++--------------- fs/f2fs/f2fs.h | 111 ++++++--- fs/f2fs/file.c | 8 +- fs/f2fs/gc.c | 4 +- fs/f2fs/inode.c | 6 +- fs/f2fs/node.c | 8 +- fs/f2fs/segment.c | 3 +- fs/f2fs/shrinker.c | 19 +- include/trace/events/f2fs.h | 62 +++-- 11 files changed, 466 insertions(+), 303 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 279c2bf4d244..991b4f9593b1 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1085,7 +1085,7 @@ void f2fs_update_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr) { dn->data_blkaddr = blkaddr; f2fs_set_data_blkaddr(dn); - f2fs_update_extent_cache(dn); + f2fs_update_read_extent_cache(dn); } /* dn->ofs_in_node will be returned with up-to-date last block pointer */ @@ -1154,7 +1154,7 @@ int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index) struct extent_info ei = {0, }; struct inode *inode = dn->inode; - if (f2fs_lookup_extent_cache(inode, index, &ei)) { + if (f2fs_lookup_read_extent_cache(inode, index, &ei)) { dn->data_blkaddr = ei.blk + index - ei.fofs; return 0; } @@ -1175,7 +1175,7 @@ struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index, if (!page) return ERR_PTR(-ENOMEM); - if (f2fs_lookup_extent_cache(inode, index, &ei)) { + if (f2fs_lookup_read_extent_cache(inode, index, &ei)) { dn.data_blkaddr = ei.blk + index - ei.fofs; if (!f2fs_is_valid_blkaddr(F2FS_I_SB(inode), dn.data_blkaddr, DATA_GENERIC_ENHANCE_READ)) { @@ -1480,7 +1480,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, pgofs = (pgoff_t)map->m_lblk; end = pgofs + maxblocks; - if (!create && f2fs_lookup_extent_cache(inode, pgofs, &ei)) { + if (!create && f2fs_lookup_read_extent_cache(inode, pgofs, &ei)) { if (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO && map->m_may_create) goto next_dnode; @@ -1654,7 +1654,7 @@ skip: if (map->m_flags & F2FS_MAP_MAPPED) { unsigned int ofs = start_pgofs - map->m_lblk; - f2fs_update_extent_cache_range(&dn, + f2fs_update_read_extent_cache_range(&dn, start_pgofs, map->m_pblk + ofs, map->m_len - ofs); } @@ -1679,7 +1679,7 @@ sync_out: if (map->m_flags & F2FS_MAP_MAPPED) { unsigned int ofs = start_pgofs - map->m_lblk; - f2fs_update_extent_cache_range(&dn, + f2fs_update_read_extent_cache_range(&dn, start_pgofs, map->m_pblk + ofs, map->m_len - ofs); } @@ -2190,7 +2190,7 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, if (f2fs_cluster_is_empty(cc)) goto out; - if (f2fs_lookup_extent_cache(inode, start_idx, &ei)) + if (f2fs_lookup_read_extent_cache(inode, start_idx, &ei)) from_dnode = false; if (!from_dnode) @@ -2607,7 +2607,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio) set_new_dnode(&dn, inode, NULL, NULL, 0); if (need_inplace_update(fio) && - f2fs_lookup_extent_cache(inode, page->index, &ei)) { + f2fs_lookup_read_extent_cache(inode, page->index, &ei)) { fio->old_blkaddr = ei.blk + page->index - ei.fofs; if (!f2fs_is_valid_blkaddr(fio->sbi, fio->old_blkaddr, @@ -3332,7 +3332,7 @@ restart: } else if (locked) { err = f2fs_get_block(&dn, index); } else { - if (f2fs_lookup_extent_cache(inode, index, &ei)) { + if (f2fs_lookup_read_extent_cache(inode, index, &ei)) { dn.data_blkaddr = ei.blk + index - ei.fofs; } else { /* hole case */ diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c index 53ed1e9191f0..75564077f2e9 100644 
--- a/fs/f2fs/debug.c +++ b/fs/f2fs/debug.c @@ -72,15 +72,23 @@ static void update_general_status(struct f2fs_sb_info *sbi) si->main_area_zones = si->main_area_sections / le32_to_cpu(raw_super->secs_per_zone); - /* validation check of the segment numbers */ + /* general extent cache stats */ + for (i = 0; i < NR_EXTENT_CACHES; i++) { + struct extent_tree_info *eti = &sbi->extent_tree[i]; + + si->hit_cached[i] = atomic64_read(&sbi->read_hit_cached[i]); + si->hit_rbtree[i] = atomic64_read(&sbi->read_hit_rbtree[i]); + si->total_ext[i] = atomic64_read(&sbi->total_hit_ext[i]); + si->hit_total[i] = si->hit_cached[i] + si->hit_rbtree[i]; + si->ext_tree[i] = atomic_read(&eti->total_ext_tree); + si->zombie_tree[i] = atomic_read(&eti->total_zombie_tree); + si->ext_node[i] = atomic_read(&eti->total_ext_node); + } + /* read extent_cache only */ si->hit_largest = atomic64_read(&sbi->read_hit_largest); - si->hit_cached = atomic64_read(&sbi->read_hit_cached); - si->hit_rbtree = atomic64_read(&sbi->read_hit_rbtree); - si->hit_total = si->hit_largest + si->hit_cached + si->hit_rbtree; - si->total_ext = atomic64_read(&sbi->total_hit_ext); - si->ext_tree = atomic_read(&sbi->total_ext_tree); - si->zombie_tree = atomic_read(&sbi->total_zombie_tree); - si->ext_node = atomic_read(&sbi->total_ext_node); + si->hit_total[EX_READ] += si->hit_largest; + + /* validation check of the segment numbers */ si->ndirty_node = get_pages(sbi, F2FS_DIRTY_NODES); si->ndirty_dent = get_pages(sbi, F2FS_DIRTY_DENTS); si->ndirty_meta = get_pages(sbi, F2FS_DIRTY_META); @@ -299,10 +307,16 @@ get_cache: si->cache_mem += si->inmem_pages * sizeof(struct inmem_pages); for (i = 0; i < MAX_INO_ENTRY; i++) si->cache_mem += sbi->im[i].ino_num * sizeof(struct ino_entry); - si->cache_mem += atomic_read(&sbi->total_ext_tree) * + + for (i = 0; i < NR_EXTENT_CACHES; i++) { + struct extent_tree_info *eti = &sbi->extent_tree[i]; + + si->ext_mem[i] = atomic_read(&eti->total_ext_tree) * sizeof(struct extent_tree); - si->cache_mem += atomic_read(&sbi->total_ext_node) * + si->ext_mem[i] += atomic_read(&eti->total_ext_node) * sizeof(struct extent_node); + si->cache_mem += si->ext_mem[i]; + } si->page_mem = 0; if (sbi->node_inode) { @@ -471,16 +485,18 @@ static int stat_show(struct seq_file *s, void *v) si->skipped_atomic_files[BG_GC]); seq_printf(s, "BG skip : IO: %u, Other: %u\n", si->io_skip_bggc, si->other_skip_bggc); - seq_puts(s, "\nExtent Cache:\n"); + seq_puts(s, "\nExtent Cache (Read):\n"); seq_printf(s, " - Hit Count: L1-1:%llu L1-2:%llu L2:%llu\n", - si->hit_largest, si->hit_cached, - si->hit_rbtree); + si->hit_largest, si->hit_cached[EX_READ], + si->hit_rbtree[EX_READ]); seq_printf(s, " - Hit Ratio: %llu%% (%llu / %llu)\n", - !si->total_ext ? 0 : - div64_u64(si->hit_total * 100, si->total_ext), - si->hit_total, si->total_ext); + !si->total_ext[EX_READ] ? 
0 : + div64_u64(si->hit_total[EX_READ] * 100, + si->total_ext[EX_READ]), + si->hit_total[EX_READ], si->total_ext[EX_READ]); seq_printf(s, " - Inner Struct Count: tree: %d(%d), node: %d\n", - si->ext_tree, si->zombie_tree, si->ext_node); + si->ext_tree[EX_READ], si->zombie_tree[EX_READ], + si->ext_node[EX_READ]); seq_puts(s, "\nBalancing F2FS Async:\n"); seq_printf(s, " - DIO (R: %4d, W: %4d)\n", si->nr_dio_read, si->nr_dio_write); @@ -546,8 +562,10 @@ static int stat_show(struct seq_file *s, void *v) (si->base_mem + si->cache_mem + si->page_mem) >> 10); seq_printf(s, " - static: %llu KB\n", si->base_mem >> 10); - seq_printf(s, " - cached: %llu KB\n", + seq_printf(s, " - cached all: %llu KB\n", si->cache_mem >> 10); + seq_printf(s, " - read extent cache: %llu KB\n", + si->ext_mem[EX_READ] >> 10); seq_printf(s, " - paged : %llu KB\n", si->page_mem >> 10); } @@ -579,10 +597,15 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi) si->sbi = sbi; sbi->stat_info = si; - atomic64_set(&sbi->total_hit_ext, 0); - atomic64_set(&sbi->read_hit_rbtree, 0); + /* general extent cache stats */ + for (i = 0; i < NR_EXTENT_CACHES; i++) { + atomic64_set(&sbi->total_hit_ext[i], 0); + atomic64_set(&sbi->read_hit_rbtree[i], 0); + atomic64_set(&sbi->read_hit_cached[i], 0); + } + + /* read extent_cache only */ atomic64_set(&sbi->read_hit_largest, 0); - atomic64_set(&sbi->read_hit_cached, 0); atomic_set(&sbi->inline_xattr, 0); atomic_set(&sbi->inline_inode, 0); diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index adee87d4b18c..e2ef1b99511b 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -17,21 +17,37 @@ static void __set_extent_info(struct extent_info *ei, unsigned int fofs, unsigned int len, - block_t blk, bool keep_clen) + block_t blk, bool keep_clen, + enum extent_type type) { ei->fofs = fofs; - ei->blk = blk; ei->len = len; - if (keep_clen) - return; - + if (type == EX_READ) { + ei->blk = blk; + if (keep_clen) + return; #ifdef CONFIG_F2FS_FS_COMPRESSION - ei->c_len = 0; + ei->c_len = 0; #endif + } } -static bool f2fs_may_extent_tree(struct inode *inode) +static bool __may_read_extent_tree(struct inode *inode) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + + if (!test_opt(sbi, READ_EXTENT_CACHE)) + return false; + if (is_inode_flag_set(inode, FI_NO_EXTENT)) + return false; + if (is_inode_flag_set(inode, FI_COMPRESSED_FILE) && + !f2fs_sb_has_readonly(sbi)) + return false; + return S_ISREG(inode->i_mode); +} + +static bool __may_extent_tree(struct inode *inode, enum extent_type type) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); @@ -42,18 +58,16 @@ static bool f2fs_may_extent_tree(struct inode *inode) if (list_empty(&sbi->s_list)) return false; - if (!test_opt(sbi, READ_EXTENT_CACHE) || - is_inode_flag_set(inode, FI_NO_EXTENT) || - (is_inode_flag_set(inode, FI_COMPRESSED_FILE) && - !f2fs_sb_has_readonly(sbi))) - return false; - - return S_ISREG(inode->i_mode); + if (type == EX_READ) + return __may_read_extent_tree(inode); + return false; } static void __try_update_largest_extent(struct extent_tree *et, struct extent_node *en) { + if (et->type != EX_READ) + return; if (en->ei.len <= et->largest.len) return; @@ -62,28 +76,31 @@ static void __try_update_largest_extent(struct extent_tree *et, } static bool __is_extent_mergeable(struct extent_info *back, - struct extent_info *front) + struct extent_info *front, enum extent_type type) { + if (type == EX_READ) { #ifdef CONFIG_F2FS_FS_COMPRESSION - if (back->c_len && back->len != back->c_len) - return false; - if (front->c_len && 
front->len != front->c_len) - return false; + if (back->c_len && back->len != back->c_len) + return false; + if (front->c_len && front->len != front->c_len) + return false; #endif - return (back->fofs + back->len == front->fofs && - back->blk + back->len == front->blk); + return (back->fofs + back->len == front->fofs && + back->blk + back->len == front->blk); + } + return false; } static bool __is_back_mergeable(struct extent_info *cur, - struct extent_info *back) + struct extent_info *back, enum extent_type type) { - return __is_extent_mergeable(back, cur); + return __is_extent_mergeable(back, cur, type); } static bool __is_front_mergeable(struct extent_info *cur, - struct extent_info *front) + struct extent_info *front, enum extent_type type) { - return __is_extent_mergeable(cur, front); + return __is_extent_mergeable(cur, front, type); } static struct rb_entry *__lookup_rb_tree_fast(struct rb_entry *cached_re, @@ -308,6 +325,7 @@ static struct extent_node *__attach_extent_node(struct f2fs_sb_info *sbi, struct rb_node *parent, struct rb_node **p, bool leftmost) { + struct extent_tree_info *eti = &sbi->extent_tree[et->type]; struct extent_node *en; en = kmem_cache_alloc(extent_node_slab, GFP_ATOMIC); @@ -321,16 +339,18 @@ static struct extent_node *__attach_extent_node(struct f2fs_sb_info *sbi, rb_link_node(&en->rb_node, parent, p); rb_insert_color_cached(&en->rb_node, &et->root, leftmost); atomic_inc(&et->node_cnt); - atomic_inc(&sbi->total_ext_node); + atomic_inc(&eti->total_ext_node); return en; } static void __detach_extent_node(struct f2fs_sb_info *sbi, struct extent_tree *et, struct extent_node *en) { + struct extent_tree_info *eti = &sbi->extent_tree[et->type]; + rb_erase_cached(&en->rb_node, &et->root); atomic_dec(&et->node_cnt); - atomic_dec(&sbi->total_ext_node); + atomic_dec(&eti->total_ext_node); if (et->cached_en == en) et->cached_en = NULL; @@ -346,41 +366,46 @@ static void __detach_extent_node(struct f2fs_sb_info *sbi, static void __release_extent_node(struct f2fs_sb_info *sbi, struct extent_tree *et, struct extent_node *en) { - spin_lock(&sbi->extent_lock); + struct extent_tree_info *eti = &sbi->extent_tree[et->type]; + + spin_lock(&eti->extent_lock); f2fs_bug_on(sbi, list_empty(&en->list)); list_del_init(&en->list); - spin_unlock(&sbi->extent_lock); + spin_unlock(&eti->extent_lock); __detach_extent_node(sbi, et, en); } -static struct extent_tree *__grab_extent_tree(struct inode *inode) +static struct extent_tree *__grab_extent_tree(struct inode *inode, + enum extent_type type) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + struct extent_tree_info *eti = &sbi->extent_tree[type]; struct extent_tree *et; nid_t ino = inode->i_ino; - mutex_lock(&sbi->extent_tree_lock); - et = radix_tree_lookup(&sbi->extent_tree_root, ino); + mutex_lock(&eti->extent_tree_lock); + et = radix_tree_lookup(&eti->extent_tree_root, ino); if (!et) { et = f2fs_kmem_cache_alloc(extent_tree_slab, GFP_NOFS); - f2fs_radix_tree_insert(&sbi->extent_tree_root, ino, et); + f2fs_radix_tree_insert(&eti->extent_tree_root, ino, et); memset(et, 0, sizeof(struct extent_tree)); et->ino = ino; + et->type = type; et->root = RB_ROOT_CACHED; et->cached_en = NULL; rwlock_init(&et->lock); INIT_LIST_HEAD(&et->list); atomic_set(&et->node_cnt, 0); - atomic_inc(&sbi->total_ext_tree); + atomic_inc(&eti->total_ext_tree); } else { - atomic_dec(&sbi->total_zombie_tree); + atomic_dec(&eti->total_zombie_tree); list_del_init(&et->list); } - mutex_unlock(&sbi->extent_tree_lock); + mutex_unlock(&eti->extent_tree_lock); /* never died 
until evict_inode */ - F2FS_I(inode)->extent_tree = et; + F2FS_I(inode)->extent_tree[type] = et; return et; } @@ -414,35 +439,38 @@ static void __drop_largest_extent(struct extent_tree *et, } /* return true, if inode page is changed */ -static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage) +static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage, + enum extent_type type) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + struct extent_tree_info *eti = &sbi->extent_tree[type]; struct f2fs_extent *i_ext = ipage ? &F2FS_INODE(ipage)->i_ext : NULL; struct extent_tree *et; struct extent_node *en; struct extent_info ei; - if (!f2fs_may_extent_tree(inode)) { - /* drop largest extent */ - if (i_ext && i_ext->len) { + if (!__may_extent_tree(inode, type)) { + /* drop largest read extent */ + if (type == EX_READ && i_ext && i_ext->len) { f2fs_wait_on_page_writeback(ipage, NODE, true, true); i_ext->len = 0; set_page_dirty(ipage); - return; } - return; + goto out; } - et = __grab_extent_tree(inode); + et = __grab_extent_tree(inode, type); if (!i_ext || !i_ext->len) - return; + goto out; + + BUG_ON(type != EX_READ); get_read_extent_info(&ei, i_ext); write_lock(&et->lock); if (atomic_read(&et->node_cnt)) - goto out; + goto unlock_out; en = __attach_extent_node(sbi, et, &ei, NULL, &et->root.rb_root.rb_node, true); @@ -450,37 +478,40 @@ static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage) et->largest = en->ei; et->cached_en = en; - spin_lock(&sbi->extent_lock); - list_add_tail(&en->list, &sbi->extent_list); - spin_unlock(&sbi->extent_lock); + spin_lock(&eti->extent_lock); + list_add_tail(&en->list, &eti->extent_list); + spin_unlock(&eti->extent_lock); } -out: +unlock_out: write_unlock(&et->lock); +out: + if (type == EX_READ && !F2FS_I(inode)->extent_tree[EX_READ]) + set_inode_flag(inode, FI_NO_EXTENT); } void f2fs_init_extent_tree(struct inode *inode, struct page *ipage) { - __f2fs_init_extent_tree(inode, ipage); - - if (!F2FS_I(inode)->extent_tree) - set_inode_flag(inode, FI_NO_EXTENT); + /* initialize read cache */ + __f2fs_init_extent_tree(inode, ipage, EX_READ); } -static bool f2fs_lookup_extent_tree(struct inode *inode, pgoff_t pgofs, - struct extent_info *ei) +static bool __lookup_extent_tree(struct inode *inode, pgoff_t pgofs, + struct extent_info *ei, enum extent_type type) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct extent_tree *et = F2FS_I(inode)->extent_tree; + struct extent_tree_info *eti = &sbi->extent_tree[type]; + struct extent_tree *et = F2FS_I(inode)->extent_tree[type]; struct extent_node *en; bool ret = false; f2fs_bug_on(sbi, !et); - trace_f2fs_lookup_extent_tree_start(inode, pgofs); + trace_f2fs_lookup_extent_tree_start(inode, pgofs, type); read_lock(&et->lock); - if (et->largest.fofs <= pgofs && + if (type == EX_READ && + et->largest.fofs <= pgofs && et->largest.fofs + et->largest.len > pgofs) { *ei = et->largest; ret = true; @@ -494,23 +525,24 @@ static bool f2fs_lookup_extent_tree(struct inode *inode, pgoff_t pgofs, goto out; if (en == et->cached_en) - stat_inc_cached_node_hit(sbi); + stat_inc_cached_node_hit(sbi, type); else - stat_inc_rbtree_node_hit(sbi); + stat_inc_rbtree_node_hit(sbi, type); *ei = en->ei; - spin_lock(&sbi->extent_lock); + spin_lock(&eti->extent_lock); if (!list_empty(&en->list)) { - list_move_tail(&en->list, &sbi->extent_list); + list_move_tail(&en->list, &eti->extent_list); et->cached_en = en; } - spin_unlock(&sbi->extent_lock); + spin_unlock(&eti->extent_lock); ret = true; out: - 
stat_inc_total_hit(sbi); + stat_inc_total_hit(sbi, type); read_unlock(&et->lock); - trace_f2fs_lookup_extent_tree_end(inode, pgofs, ei); + if (type == EX_READ) + trace_f2fs_lookup_read_extent_tree_end(inode, pgofs, ei); return ret; } @@ -519,18 +551,20 @@ static struct extent_node *__try_merge_extent_node(struct f2fs_sb_info *sbi, struct extent_node *prev_ex, struct extent_node *next_ex) { + struct extent_tree_info *eti = &sbi->extent_tree[et->type]; struct extent_node *en = NULL; - if (prev_ex && __is_back_mergeable(ei, &prev_ex->ei)) { + if (prev_ex && __is_back_mergeable(ei, &prev_ex->ei, et->type)) { prev_ex->ei.len += ei->len; ei = &prev_ex->ei; en = prev_ex; } - if (next_ex && __is_front_mergeable(ei, &next_ex->ei)) { + if (next_ex && __is_front_mergeable(ei, &next_ex->ei, et->type)) { next_ex->ei.fofs = ei->fofs; - next_ex->ei.blk = ei->blk; next_ex->ei.len += ei->len; + if (et->type == EX_READ) + next_ex->ei.blk = ei->blk; if (en) __release_extent_node(sbi, et, prev_ex); @@ -542,12 +576,12 @@ static struct extent_node *__try_merge_extent_node(struct f2fs_sb_info *sbi, __try_update_largest_extent(et, en); - spin_lock(&sbi->extent_lock); + spin_lock(&eti->extent_lock); if (!list_empty(&en->list)) { - list_move_tail(&en->list, &sbi->extent_list); + list_move_tail(&en->list, &eti->extent_list); et->cached_en = en; } - spin_unlock(&sbi->extent_lock); + spin_unlock(&eti->extent_lock); return en; } @@ -557,6 +591,7 @@ static struct extent_node *__insert_extent_tree(struct f2fs_sb_info *sbi, struct rb_node *insert_parent, bool leftmost) { + struct extent_tree_info *eti = &sbi->extent_tree[et->type]; struct rb_node **p; struct rb_node *parent = NULL; struct extent_node *en = NULL; @@ -579,48 +614,51 @@ do_insert: __try_update_largest_extent(et, en); /* update in global extent list */ - spin_lock(&sbi->extent_lock); - list_add_tail(&en->list, &sbi->extent_list); + spin_lock(&eti->extent_lock); + list_add_tail(&en->list, &eti->extent_list); et->cached_en = en; - spin_unlock(&sbi->extent_lock); + spin_unlock(&eti->extent_lock); return en; } -static void f2fs_update_extent_tree_range(struct inode *inode, - pgoff_t fofs, block_t blkaddr, unsigned int len) +static void __update_extent_tree_range(struct inode *inode, + struct extent_info *tei, enum extent_type type) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct extent_tree *et = F2FS_I(inode)->extent_tree; + struct extent_tree *et = F2FS_I(inode)->extent_tree[type]; struct extent_node *en = NULL, *en1 = NULL; struct extent_node *prev_en = NULL, *next_en = NULL; struct extent_info ei, dei, prev; struct rb_node **insert_p = NULL, *insert_parent = NULL; + unsigned int fofs = tei->fofs, len = tei->len; unsigned int end = fofs + len; - unsigned int pos = (unsigned int)fofs; bool updated = false; bool leftmost = false; if (!et) return; - trace_f2fs_update_extent_tree_range(inode, fofs, blkaddr, len, 0); - + if (type == EX_READ) + trace_f2fs_update_read_extent_tree_range(inode, fofs, len, + tei->blk, 0); write_lock(&et->lock); - if (is_inode_flag_set(inode, FI_NO_EXTENT)) { - write_unlock(&et->lock); - return; + if (type == EX_READ) { + if (is_inode_flag_set(inode, FI_NO_EXTENT)) { + write_unlock(&et->lock); + return; + } + + prev = et->largest; + dei.len = 0; + + /* + * drop largest extent before lookup, in case it's already + * been shrunk from extent tree + */ + __drop_largest_extent(et, fofs, len); } - prev = et->largest; - dei.len = 0; - - /* - * drop largest extent before lookup, in case it's already - * been shrunk from extent tree - */ 
- __drop_largest_extent(et, fofs, len); - /* 1. lookup first extent node in range [fofs, fofs + len - 1] */ en = (struct extent_node *)f2fs_lookup_rb_tree_ret(&et->root, (struct rb_entry *)et->cached_en, fofs, @@ -640,26 +678,30 @@ static void f2fs_update_extent_tree_range(struct inode *inode, dei = en->ei; org_end = dei.fofs + dei.len; - f2fs_bug_on(sbi, pos >= org_end); + f2fs_bug_on(sbi, fofs >= org_end); - if (pos > dei.fofs && pos - dei.fofs >= F2FS_MIN_EXTENT_LEN) { - en->ei.len = pos - en->ei.fofs; + if (fofs > dei.fofs && (type != EX_READ || + fofs - dei.fofs >= F2FS_MIN_EXTENT_LEN)) { + en->ei.len = fofs - en->ei.fofs; prev_en = en; parts = 1; } - if (end < org_end && org_end - end >= F2FS_MIN_EXTENT_LEN) { + if (end < org_end && (type != EX_READ || + org_end - end >= F2FS_MIN_EXTENT_LEN)) { if (parts) { __set_extent_info(&ei, end, org_end - end, - end - dei.fofs + dei.blk, false); + end - dei.fofs + dei.blk, false, + type); en1 = __insert_extent_tree(sbi, et, &ei, NULL, NULL, true); next_en = en1; } else { __set_extent_info(&en->ei, end, en->ei.len - (end - dei.fofs), - en->ei.blk + (end - dei.fofs), true); + en->ei.blk + (end - dei.fofs), true, + type); next_en = en; } parts++; @@ -689,9 +731,11 @@ static void f2fs_update_extent_tree_range(struct inode *inode, en = next_en; } - /* 3. update extent in extent cache */ - if (blkaddr) { - __set_extent_info(&ei, fofs, len, blkaddr, false); + /* 3. update extent in read extent cache */ + BUG_ON(type != EX_READ); + + if (tei->blk) { + __set_extent_info(&ei, fofs, len, tei->blk, false, EX_READ); if (!__try_merge_extent_node(sbi, et, &ei, prev_en, next_en)) __insert_extent_tree(sbi, et, &ei, insert_p, insert_parent, leftmost); @@ -721,19 +765,20 @@ static void f2fs_update_extent_tree_range(struct inode *inode, } #ifdef CONFIG_F2FS_FS_COMPRESSION -void f2fs_update_extent_tree_range_compressed(struct inode *inode, +void f2fs_update_read_extent_tree_range_compressed(struct inode *inode, pgoff_t fofs, block_t blkaddr, unsigned int llen, unsigned int c_len) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct extent_tree *et = F2FS_I(inode)->extent_tree; + struct extent_tree *et = F2FS_I(inode)->extent_tree[EX_READ]; struct extent_node *en = NULL; struct extent_node *prev_en = NULL, *next_en = NULL; struct extent_info ei; struct rb_node **insert_p = NULL, *insert_parent = NULL; bool leftmost = false; - trace_f2fs_update_extent_tree_range(inode, fofs, blkaddr, llen, c_len); + trace_f2fs_update_read_extent_tree_range(inode, fofs, llen, + blkaddr, c_len); /* it is safe here to check FI_NO_EXTENT w/o et->lock in ro image */ if (is_inode_flag_set(inode, FI_NO_EXTENT)) @@ -750,7 +795,7 @@ void f2fs_update_extent_tree_range_compressed(struct inode *inode, if (en) goto unlock_out; - __set_extent_info(&ei, fofs, llen, blkaddr, true); + __set_extent_info(&ei, fofs, llen, blkaddr, true, EX_READ); ei.c_len = c_len; if (!__try_merge_extent_node(sbi, et, &ei, prev_en, next_en)) @@ -761,24 +806,43 @@ unlock_out: } #endif -unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink) +static void __update_extent_cache(struct dnode_of_data *dn, enum extent_type type) { + struct extent_info ei; + + if (!__may_extent_tree(dn->inode, type)) + return; + + ei.fofs = f2fs_start_bidx_of_node(ofs_of_node(dn->node_page), dn->inode) + + dn->ofs_in_node; + ei.len = 1; + + if (type == EX_READ) { + if (dn->data_blkaddr == NEW_ADDR) + ei.blk = NULL_ADDR; + else + ei.blk = dn->data_blkaddr; + } + __update_extent_tree_range(dn->inode, &ei, type); +} + 
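Each __update_extent_cache() call above caches a single block; contiguous single-block updates are then coalesced by __try_merge_extent_node(). For reference, a small standalone illustration (userspace C, not part of this patch) of the EX_READ mergeability rule: two cached read extents collapse into one when both their file ranges and their on-disk block ranges are contiguous.

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for struct extent_info; compression (c_len) is ignored. */
struct ext { unsigned int fofs, len, blk; };

/* Mirrors, in spirit, the EX_READ branch of __is_extent_mergeable(). */
static bool read_extents_mergeable(struct ext back, struct ext front)
{
	return back.fofs + back.len == front.fofs &&
	       back.blk + back.len == front.blk;
}

int main(void)
{
	struct ext back  = { .fofs = 100, .len = 8, .blk = 5000 };
	struct ext front = { .fofs = 108, .len = 4, .blk = 5008 };

	/* Prints "mergeable": the two would collapse into { 100, 12, 5000 }. */
	puts(read_extents_mergeable(back, front) ? "mergeable" : "not mergeable");
	return 0;
}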
+static unsigned int __shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink, + enum extent_type type) +{ + struct extent_tree_info *eti = &sbi->extent_tree[type]; struct extent_tree *et, *next; struct extent_node *en; unsigned int node_cnt = 0, tree_cnt = 0; int remained; - if (!test_opt(sbi, READ_EXTENT_CACHE)) - return 0; - - if (!atomic_read(&sbi->total_zombie_tree)) + if (!atomic_read(&eti->total_zombie_tree)) goto free_node; - if (!mutex_trylock(&sbi->extent_tree_lock)) + if (!mutex_trylock(&eti->extent_tree_lock)) goto out; /* 1. remove unreferenced extent tree */ - list_for_each_entry_safe(et, next, &sbi->zombie_list, list) { + list_for_each_entry_safe(et, next, &eti->zombie_list, list) { if (atomic_read(&et->node_cnt)) { write_lock(&et->lock); node_cnt += __free_extent_tree(sbi, et); @@ -786,61 +850,100 @@ unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink) } f2fs_bug_on(sbi, atomic_read(&et->node_cnt)); list_del_init(&et->list); - radix_tree_delete(&sbi->extent_tree_root, et->ino); + radix_tree_delete(&eti->extent_tree_root, et->ino); kmem_cache_free(extent_tree_slab, et); - atomic_dec(&sbi->total_ext_tree); - atomic_dec(&sbi->total_zombie_tree); + atomic_dec(&eti->total_ext_tree); + atomic_dec(&eti->total_zombie_tree); tree_cnt++; if (node_cnt + tree_cnt >= nr_shrink) goto unlock_out; cond_resched(); } - mutex_unlock(&sbi->extent_tree_lock); + mutex_unlock(&eti->extent_tree_lock); free_node: /* 2. remove LRU extent entries */ - if (!mutex_trylock(&sbi->extent_tree_lock)) + if (!mutex_trylock(&eti->extent_tree_lock)) goto out; remained = nr_shrink - (node_cnt + tree_cnt); - spin_lock(&sbi->extent_lock); + spin_lock(&eti->extent_lock); for (; remained > 0; remained--) { - if (list_empty(&sbi->extent_list)) + if (list_empty(&eti->extent_list)) break; - en = list_first_entry(&sbi->extent_list, + en = list_first_entry(&eti->extent_list, struct extent_node, list); et = en->et; if (!write_trylock(&et->lock)) { /* refresh this extent node's position in extent list */ - list_move_tail(&en->list, &sbi->extent_list); + list_move_tail(&en->list, &eti->extent_list); continue; } list_del_init(&en->list); - spin_unlock(&sbi->extent_lock); + spin_unlock(&eti->extent_lock); __detach_extent_node(sbi, et, en); write_unlock(&et->lock); node_cnt++; - spin_lock(&sbi->extent_lock); + spin_lock(&eti->extent_lock); } - spin_unlock(&sbi->extent_lock); + spin_unlock(&eti->extent_lock); unlock_out: - mutex_unlock(&sbi->extent_tree_lock); + mutex_unlock(&eti->extent_tree_lock); out: - trace_f2fs_shrink_extent_tree(sbi, node_cnt, tree_cnt); + trace_f2fs_shrink_extent_tree(sbi, node_cnt, tree_cnt, type); return node_cnt + tree_cnt; } -unsigned int f2fs_destroy_extent_node(struct inode *inode) +/* read extent cache operations */ +bool f2fs_lookup_read_extent_cache(struct inode *inode, pgoff_t pgofs, + struct extent_info *ei) +{ + if (!__may_extent_tree(inode, EX_READ)) + return false; + + return __lookup_extent_tree(inode, pgofs, ei, EX_READ); +} + +void f2fs_update_read_extent_cache(struct dnode_of_data *dn) +{ + return __update_extent_cache(dn, EX_READ); +} + +void f2fs_update_read_extent_cache_range(struct dnode_of_data *dn, + pgoff_t fofs, block_t blkaddr, unsigned int len) +{ + struct extent_info ei = { + .fofs = fofs, + .len = len, + .blk = blkaddr, + }; + + if (!__may_extent_tree(dn->inode, EX_READ)) + return; + + __update_extent_tree_range(dn->inode, &ei, EX_READ); +} + +unsigned int f2fs_shrink_read_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink) +{ + if 
(!test_opt(sbi, READ_EXTENT_CACHE)) + return 0; + + return __shrink_extent_tree(sbi, nr_shrink, EX_READ); +} + +static unsigned int __destroy_extent_node(struct inode *inode, + enum extent_type type) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct extent_tree *et = F2FS_I(inode)->extent_tree; + struct extent_tree *et = F2FS_I(inode)->extent_tree[type]; unsigned int node_cnt = 0; if (!et || !atomic_read(&et->node_cnt)) @@ -853,31 +956,44 @@ unsigned int f2fs_destroy_extent_node(struct inode *inode) return node_cnt; } -void f2fs_drop_extent_tree(struct inode *inode) +void f2fs_destroy_extent_node(struct inode *inode) +{ + __destroy_extent_node(inode, EX_READ); +} + +static void __drop_extent_tree(struct inode *inode, enum extent_type type) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct extent_tree *et = F2FS_I(inode)->extent_tree; + struct extent_tree *et = F2FS_I(inode)->extent_tree[type]; bool updated = false; - if (!f2fs_may_extent_tree(inode)) + if (!__may_extent_tree(inode, type)) return; write_lock(&et->lock); - set_inode_flag(inode, FI_NO_EXTENT); __free_extent_tree(sbi, et); - if (et->largest.len) { - et->largest.len = 0; - updated = true; + if (type == EX_READ) { + set_inode_flag(inode, FI_NO_EXTENT); + if (et->largest.len) { + et->largest.len = 0; + updated = true; + } } write_unlock(&et->lock); if (updated) f2fs_mark_inode_dirty_sync(inode, true); } -void f2fs_destroy_extent_tree(struct inode *inode) +void f2fs_drop_extent_tree(struct inode *inode) +{ + __drop_extent_tree(inode, EX_READ); +} + +static void __destroy_extent_tree(struct inode *inode, enum extent_type type) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct extent_tree *et = F2FS_I(inode)->extent_tree; + struct extent_tree_info *eti = &sbi->extent_tree[type]; + struct extent_tree *et = F2FS_I(inode)->extent_tree[type]; unsigned int node_cnt = 0; if (!et) @@ -885,76 +1001,49 @@ void f2fs_destroy_extent_tree(struct inode *inode) if (inode->i_nlink && !is_bad_inode(inode) && atomic_read(&et->node_cnt)) { - mutex_lock(&sbi->extent_tree_lock); - list_add_tail(&et->list, &sbi->zombie_list); - atomic_inc(&sbi->total_zombie_tree); - mutex_unlock(&sbi->extent_tree_lock); + mutex_lock(&eti->extent_tree_lock); + list_add_tail(&et->list, &eti->zombie_list); + atomic_inc(&eti->total_zombie_tree); + mutex_unlock(&eti->extent_tree_lock); return; } /* free all extent info belong to this extent tree */ - node_cnt = f2fs_destroy_extent_node(inode); + node_cnt = __destroy_extent_node(inode, type); /* delete extent tree entry in radix tree */ - mutex_lock(&sbi->extent_tree_lock); + mutex_lock(&eti->extent_tree_lock); f2fs_bug_on(sbi, atomic_read(&et->node_cnt)); - radix_tree_delete(&sbi->extent_tree_root, inode->i_ino); + radix_tree_delete(&eti->extent_tree_root, inode->i_ino); kmem_cache_free(extent_tree_slab, et); - atomic_dec(&sbi->total_ext_tree); - mutex_unlock(&sbi->extent_tree_lock); + atomic_dec(&eti->total_ext_tree); + mutex_unlock(&eti->extent_tree_lock); - F2FS_I(inode)->extent_tree = NULL; + F2FS_I(inode)->extent_tree[type] = NULL; - trace_f2fs_destroy_extent_tree(inode, node_cnt); + trace_f2fs_destroy_extent_tree(inode, node_cnt, type); } -bool f2fs_lookup_extent_cache(struct inode *inode, pgoff_t pgofs, - struct extent_info *ei) +void f2fs_destroy_extent_tree(struct inode *inode) { - if (!f2fs_may_extent_tree(inode)) - return false; - - return f2fs_lookup_extent_tree(inode, pgofs, ei); + __destroy_extent_tree(inode, EX_READ); } -void f2fs_update_extent_cache(struct dnode_of_data *dn) +static void 
__init_extent_tree_info(struct extent_tree_info *eti) { - pgoff_t fofs; - block_t blkaddr; - - if (!f2fs_may_extent_tree(dn->inode)) - return; - - if (dn->data_blkaddr == NEW_ADDR) - blkaddr = NULL_ADDR; - else - blkaddr = dn->data_blkaddr; - - fofs = f2fs_start_bidx_of_node(ofs_of_node(dn->node_page), dn->inode) + - dn->ofs_in_node; - f2fs_update_extent_tree_range(dn->inode, fofs, blkaddr, 1); -} - -void f2fs_update_extent_cache_range(struct dnode_of_data *dn, - pgoff_t fofs, block_t blkaddr, unsigned int len) - -{ - if (!f2fs_may_extent_tree(dn->inode)) - return; - - f2fs_update_extent_tree_range(dn->inode, fofs, blkaddr, len); + INIT_RADIX_TREE(&eti->extent_tree_root, GFP_NOIO); + mutex_init(&eti->extent_tree_lock); + INIT_LIST_HEAD(&eti->extent_list); + spin_lock_init(&eti->extent_lock); + atomic_set(&eti->total_ext_tree, 0); + INIT_LIST_HEAD(&eti->zombie_list); + atomic_set(&eti->total_zombie_tree, 0); + atomic_set(&eti->total_ext_node, 0); } void f2fs_init_extent_cache_info(struct f2fs_sb_info *sbi) { - INIT_RADIX_TREE(&sbi->extent_tree_root, GFP_NOIO); - mutex_init(&sbi->extent_tree_lock); - INIT_LIST_HEAD(&sbi->extent_list); - spin_lock_init(&sbi->extent_lock); - atomic_set(&sbi->total_ext_tree, 0); - INIT_LIST_HEAD(&sbi->zombie_list); - atomic_set(&sbi->total_zombie_tree, 0); - atomic_set(&sbi->total_ext_node, 0); + __init_extent_tree_info(&sbi->extent_tree[EX_READ]); } int __init f2fs_create_extent_cache(void) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 3ecb53f2f67f..fff32fc07c2b 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -570,6 +570,12 @@ enum { /* number of extent info in extent cache we try to shrink */ #define READ_EXTENT_CACHE_SHRINK_NUMBER 128 +/* extent cache type */ +enum extent_type { + EX_READ, + NR_EXTENT_CACHES, +}; + struct rb_entry { struct rb_node rb_node; /* rb node located in rb-tree */ union { @@ -584,10 +590,17 @@ struct rb_entry { struct extent_info { unsigned int fofs; /* start offset in a file */ unsigned int len; /* length of the extent */ - block_t blk; /* start block address of the extent */ + union { + /* read extent_cache */ + struct { + /* start block address of the extent */ + block_t blk; #ifdef CONFIG_F2FS_FS_COMPRESSION - unsigned int c_len; /* physical extent length of compressed blocks */ + /* physical extent length of compressed blocks */ + unsigned int c_len; #endif + }; + }; }; struct extent_node { @@ -599,13 +612,25 @@ struct extent_node { struct extent_tree { nid_t ino; /* inode number */ + enum extent_type type; /* keep the extent tree type */ struct rb_root_cached root; /* root of extent info rb-tree */ struct extent_node *cached_en; /* recently accessed extent node */ - struct extent_info largest; /* largested extent info */ struct list_head list; /* to be used by sbi->zombie_list */ rwlock_t lock; /* protect extent info rb-tree */ atomic_t node_cnt; /* # of extent node in rb-tree*/ bool largest_updated; /* largest extent updated */ + struct extent_info largest; /* largest cached extent for EX_READ */ +}; + +struct extent_tree_info { + struct radix_tree_root extent_tree_root;/* cache extent cache entries */ + struct mutex extent_tree_lock; /* locking extent radix tree */ + struct list_head extent_list; /* lru list for shrinker */ + spinlock_t extent_lock; /* locking extent lru list */ + atomic_t total_ext_tree; /* extent tree count */ + struct list_head zombie_list; /* extent zombie tree list */ + atomic_t total_zombie_tree; /* extent zombie tree count */ + atomic_t total_ext_node; /* extent info count */ }; /* @@ -764,7 
+789,8 @@ struct f2fs_inode_info { struct list_head inmem_pages; /* inmemory pages managed by f2fs */ struct task_struct *inmem_task; /* store inmemory task */ struct mutex inmem_lock; /* lock for inmemory pages */ - struct extent_tree *extent_tree; /* cached extent_tree entry */ + struct extent_tree *extent_tree[NR_EXTENT_CACHES]; + /* cached extent_tree entry */ /* avoid racing between foreground op and gc */ struct f2fs_rwsem i_gc_rwsem[2]; @@ -1562,14 +1588,7 @@ struct f2fs_sb_info { struct mutex flush_lock; /* for flush exclusion */ /* for extent tree cache */ - struct radix_tree_root extent_tree_root;/* cache extent cache entries */ - struct mutex extent_tree_lock; /* locking extent radix tree */ - struct list_head extent_list; /* lru list for shrinker */ - spinlock_t extent_lock; /* locking extent lru list */ - atomic_t total_ext_tree; /* extent tree count */ - struct list_head zombie_list; /* extent zombie tree list */ - atomic_t total_zombie_tree; /* extent zombie tree count */ - atomic_t total_ext_node; /* extent info count */ + struct extent_tree_info extent_tree[NR_EXTENT_CACHES]; /* basic filesystem units */ unsigned int log_sectors_per_block; /* log2 sectors per block */ @@ -1650,10 +1669,14 @@ struct f2fs_sb_info { unsigned int segment_count[2]; /* # of allocated segments */ unsigned int block_count[2]; /* # of allocated blocks */ atomic_t inplace_count; /* # of inplace update */ - atomic64_t total_hit_ext; /* # of lookup extent cache */ - atomic64_t read_hit_rbtree; /* # of hit rbtree extent node */ - atomic64_t read_hit_largest; /* # of hit largest extent node */ - atomic64_t read_hit_cached; /* # of hit cached extent node */ + /* # of lookup extent cache */ + atomic64_t total_hit_ext[NR_EXTENT_CACHES]; + /* # of hit rbtree extent node */ + atomic64_t read_hit_rbtree[NR_EXTENT_CACHES]; + /* # of hit cached extent node */ + atomic64_t read_hit_cached[NR_EXTENT_CACHES]; + /* # of hit largest extent node in read extent cache */ + atomic64_t read_hit_largest; atomic_t inline_xattr; /* # of inline_xattr inodes */ atomic_t inline_inode; /* # of inline_data inodes */ atomic_t inline_dir; /* # of inline_dentry inodes */ @@ -3736,9 +3759,17 @@ struct f2fs_stat_info { struct f2fs_sb_info *sbi; int all_area_segs, sit_area_segs, nat_area_segs, ssa_area_segs; int main_area_segs, main_area_sections, main_area_zones; - unsigned long long hit_largest, hit_cached, hit_rbtree; - unsigned long long hit_total, total_ext; - int ext_tree, zombie_tree, ext_node; + unsigned long long hit_cached[NR_EXTENT_CACHES]; + unsigned long long hit_rbtree[NR_EXTENT_CACHES]; + unsigned long long total_ext[NR_EXTENT_CACHES]; + unsigned long long hit_total[NR_EXTENT_CACHES]; + int ext_tree[NR_EXTENT_CACHES]; + int zombie_tree[NR_EXTENT_CACHES]; + int ext_node[NR_EXTENT_CACHES]; + /* to count memory footprint */ + unsigned long long ext_mem[NR_EXTENT_CACHES]; + /* for read extent cache */ + unsigned long long hit_largest; int ndirty_node, ndirty_dent, ndirty_meta, ndirty_imeta; int ndirty_data, ndirty_qdata; int inmem_pages; @@ -3799,10 +3830,10 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi) #define stat_other_skip_bggc_count(sbi) ((sbi)->other_skip_bggc++) #define stat_inc_dirty_inode(sbi, type) ((sbi)->ndirty_inode[type]++) #define stat_dec_dirty_inode(sbi, type) ((sbi)->ndirty_inode[type]--) -#define stat_inc_total_hit(sbi) (atomic64_inc(&(sbi)->total_hit_ext)) -#define stat_inc_rbtree_node_hit(sbi) (atomic64_inc(&(sbi)->read_hit_rbtree)) +#define stat_inc_total_hit(sbi, type) 
(atomic64_inc(&(sbi)->total_hit_ext[type])) +#define stat_inc_rbtree_node_hit(sbi, type) (atomic64_inc(&(sbi)->read_hit_rbtree[type])) #define stat_inc_largest_node_hit(sbi) (atomic64_inc(&(sbi)->read_hit_largest)) -#define stat_inc_cached_node_hit(sbi) (atomic64_inc(&(sbi)->read_hit_cached)) +#define stat_inc_cached_node_hit(sbi, type) (atomic64_inc(&(sbi)->read_hit_cached[type])) #define stat_inc_inline_xattr(inode) \ do { \ if (f2fs_has_inline_xattr(inode)) \ @@ -3928,10 +3959,10 @@ void f2fs_update_sit_info(struct f2fs_sb_info *sbi); #define stat_other_skip_bggc_count(sbi) do { } while (0) #define stat_inc_dirty_inode(sbi, type) do { } while (0) #define stat_dec_dirty_inode(sbi, type) do { } while (0) -#define stat_inc_total_hit(sbi) do { } while (0) -#define stat_inc_rbtree_node_hit(sbi) do { } while (0) +#define stat_inc_total_hit(sbi, type) do { } while (0) +#define stat_inc_rbtree_node_hit(sbi, type) do { } while (0) #define stat_inc_largest_node_hit(sbi) do { } while (0) -#define stat_inc_cached_node_hit(sbi) do { } while (0) +#define stat_inc_cached_node_hit(sbi, type) do { } while (0) #define stat_inc_inline_xattr(inode) do { } while (0) #define stat_dec_inline_xattr(inode) do { } while (0) #define stat_inc_inline_inode(inode) do { } while (0) @@ -4036,20 +4067,23 @@ struct rb_entry *f2fs_lookup_rb_tree_ret(struct rb_root_cached *root, bool force, bool *leftmost); bool f2fs_check_rb_tree_consistence(struct f2fs_sb_info *sbi, struct rb_root_cached *root, bool check_key); -unsigned int f2fs_shrink_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink); void f2fs_init_extent_tree(struct inode *inode, struct page *ipage); void f2fs_drop_extent_tree(struct inode *inode); -unsigned int f2fs_destroy_extent_node(struct inode *inode); +void f2fs_destroy_extent_node(struct inode *inode); void f2fs_destroy_extent_tree(struct inode *inode); -bool f2fs_lookup_extent_cache(struct inode *inode, pgoff_t pgofs, - struct extent_info *ei); -void f2fs_update_extent_cache(struct dnode_of_data *dn); -void f2fs_update_extent_cache_range(struct dnode_of_data *dn, - pgoff_t fofs, block_t blkaddr, unsigned int len); void f2fs_init_extent_cache_info(struct f2fs_sb_info *sbi); int __init f2fs_create_extent_cache(void); void f2fs_destroy_extent_cache(void); +/* read extent cache ops */ +bool f2fs_lookup_read_extent_cache(struct inode *inode, pgoff_t pgofs, + struct extent_info *ei); +void f2fs_update_read_extent_cache(struct dnode_of_data *dn); +void f2fs_update_read_extent_cache_range(struct dnode_of_data *dn, + pgoff_t fofs, block_t blkaddr, unsigned int len); +unsigned int f2fs_shrink_read_extent_tree(struct f2fs_sb_info *sbi, + int nr_shrink); + /* * sysfs.c */ @@ -4113,9 +4147,9 @@ int f2fs_write_multi_pages(struct compress_ctx *cc, struct writeback_control *wbc, enum iostat_type io_type); int f2fs_is_compressed_cluster(struct inode *inode, pgoff_t index); -void f2fs_update_extent_tree_range_compressed(struct inode *inode, - pgoff_t fofs, block_t blkaddr, unsigned int llen, - unsigned int c_len); +void f2fs_update_read_extent_tree_range_compressed(struct inode *inode, + pgoff_t fofs, block_t blkaddr, + unsigned int llen, unsigned int c_len); int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, unsigned nr_pages, sector_t *last_block_in_bio, bool is_readahead, bool for_write); @@ -4193,9 +4227,10 @@ static inline bool f2fs_load_compressed_page(struct f2fs_sb_info *sbi, static inline void f2fs_invalidate_compress_pages(struct f2fs_sb_info *sbi, nid_t ino) { } #define 
inc_compr_inode_stat(inode) do { } while (0) -static inline void f2fs_update_extent_tree_range_compressed(struct inode *inode, - pgoff_t fofs, block_t blkaddr, unsigned int llen, - unsigned int c_len) { } +static inline void f2fs_update_read_extent_tree_range_compressed( + struct inode *inode, + pgoff_t fofs, block_t blkaddr, + unsigned int llen, unsigned int c_len) { } #endif static inline int set_compress_context(struct inode *inode) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 36d8f0376f76..18da58068820 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -607,7 +607,7 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count) */ fofs = f2fs_start_bidx_of_node(ofs_of_node(dn->node_page), dn->inode) + ofs; - f2fs_update_extent_cache_range(dn, fofs, 0, len); + f2fs_update_read_extent_cache_range(dn, fofs, 0, len); dec_valid_block_count(sbi, dn->inode, nr_free); } dn->ofs_in_node = ofs; @@ -1430,7 +1430,7 @@ static int f2fs_do_zero_range(struct dnode_of_data *dn, pgoff_t start, f2fs_set_data_blkaddr(dn); } - f2fs_update_extent_cache_range(dn, start, 0, index - start); + f2fs_update_read_extent_cache_range(dn, start, 0, index - start); return ret; } @@ -2590,7 +2590,7 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi, struct f2fs_map_blocks map = { .m_next_extent = NULL, .m_seg_type = NO_CHECK_TYPE, .m_may_create = false }; - struct extent_info ei = {0, 0, 0}; + struct extent_info ei = {0, }; pgoff_t pg_start, pg_end, next_pgofs; unsigned int blk_per_seg = sbi->blocks_per_seg; unsigned int total = 0, sec_num; @@ -2622,7 +2622,7 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi, * lookup mapping info in extent cache, skip defragmenting if physical * block addresses are continuous. */ - if (f2fs_lookup_extent_cache(inode, pg_start, &ei)) { + if (f2fs_lookup_read_extent_cache(inode, pg_start, &ei)) { if (ei.fofs + ei.len >= pg_end) goto out; } diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index 30949eac81c3..fe9ea12e0a5f 100644 --- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -1054,7 +1054,7 @@ static int ra_data_block(struct inode *inode, pgoff_t index) struct address_space *mapping = inode->i_mapping; struct dnode_of_data dn; struct page *page; - struct extent_info ei = {0, 0, 0}; + struct extent_info ei = {0, }; struct f2fs_io_info fio = { .sbi = sbi, .ino = inode->i_ino, @@ -1072,7 +1072,7 @@ static int ra_data_block(struct inode *inode, pgoff_t index) if (!page) return -ENOMEM; - if (f2fs_lookup_extent_cache(inode, index, &ei)) { + if (f2fs_lookup_read_extent_cache(inode, index, &ei)) { dn.data_blkaddr = ei.blk + index - ei.fofs; if (unlikely(!f2fs_is_valid_blkaddr(sbi, dn.data_blkaddr, DATA_GENERIC_ENHANCE_READ))) { diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index 0b54b678137f..cedad8613f89 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -260,8 +260,8 @@ static bool sanity_check_inode(struct inode *inode, struct page *node_page) return false; } - if (F2FS_I(inode)->extent_tree) { - struct extent_info *ei = &F2FS_I(inode)->extent_tree->largest; + if (fi->extent_tree[EX_READ]) { + struct extent_info *ei = &fi->extent_tree[EX_READ]->largest; if (ei->len && (!f2fs_is_valid_blkaddr(sbi, ei->blk, @@ -571,7 +571,7 @@ retry: void f2fs_update_inode(struct inode *inode, struct page *node_page) { struct f2fs_inode *ri; - struct extent_tree *et = F2FS_I(inode)->extent_tree; + struct extent_tree *et = F2FS_I(inode)->extent_tree[EX_READ]; f2fs_wait_on_page_writeback(node_page, NODE, true, true); set_page_dirty(node_page); diff --git a/fs/f2fs/node.c 
b/fs/f2fs/node.c index a55b3676233c..4e5cbd856df7 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -84,9 +84,11 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type) mem_size >>= PAGE_SHIFT; res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1); } else if (type == READ_EXTENT_CACHE) { - mem_size = (atomic_read(&sbi->total_ext_tree) * + struct extent_tree_info *eti = &sbi->extent_tree[EX_READ]; + + mem_size = (atomic_read(&eti->total_ext_tree) * sizeof(struct extent_tree) + - atomic_read(&sbi->total_ext_node) * + atomic_read(&eti->total_ext_node) * sizeof(struct extent_node)) >> PAGE_SHIFT; res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1); } else if (type == INMEM_PAGES) { @@ -860,7 +862,7 @@ int f2fs_get_dnode_of_data(struct dnode_of_data *dn, pgoff_t index, int mode) blkaddr = data_blkaddr(dn->inode, dn->node_page, dn->ofs_in_node + 1); - f2fs_update_extent_tree_range_compressed(dn->inode, + f2fs_update_read_extent_tree_range_compressed(dn->inode, index, blkaddr, F2FS_I(dn->inode)->i_cluster_size, c_len); diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index ef599fbab740..512781ad3e61 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -537,7 +537,8 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi, bool from_bg) /* try to shrink extent cache when there is no enough memory */ if (!f2fs_available_free_memory(sbi, READ_EXTENT_CACHE)) - f2fs_shrink_extent_tree(sbi, READ_EXTENT_CACHE_SHRINK_NUMBER); + f2fs_shrink_read_extent_tree(sbi, + READ_EXTENT_CACHE_SHRINK_NUMBER); /* check the # of cached NAT entries */ if (!f2fs_available_free_memory(sbi, NAT_ENTRIES)) diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c index dd3c3c7a90ec..33c490e69ae3 100644 --- a/fs/f2fs/shrinker.c +++ b/fs/f2fs/shrinker.c @@ -28,10 +28,13 @@ static unsigned long __count_free_nids(struct f2fs_sb_info *sbi) return count > 0 ? 
count : 0; } -static unsigned long __count_extent_cache(struct f2fs_sb_info *sbi) +static unsigned long __count_extent_cache(struct f2fs_sb_info *sbi, + enum extent_type type) { - return atomic_read(&sbi->total_zombie_tree) + - atomic_read(&sbi->total_ext_node); + struct extent_tree_info *eti = &sbi->extent_tree[type]; + + return atomic_read(&eti->total_zombie_tree) + + atomic_read(&eti->total_ext_node); } unsigned long f2fs_shrink_count(struct shrinker *shrink, @@ -53,8 +56,8 @@ unsigned long f2fs_shrink_count(struct shrinker *shrink, } spin_unlock(&f2fs_list_lock); - /* count extent cache entries */ - count += __count_extent_cache(sbi); + /* count read extent cache entries */ + count += __count_extent_cache(sbi, EX_READ); /* count clean nat cache entries */ count += __count_nat_entries(sbi); @@ -99,8 +102,8 @@ unsigned long f2fs_shrink_scan(struct shrinker *shrink, sbi->shrinker_run_no = run_no; - /* shrink extent cache entries */ - freed += f2fs_shrink_extent_tree(sbi, nr >> 1); + /* shrink read extent cache entries */ + freed += f2fs_shrink_read_extent_tree(sbi, nr >> 1); /* shrink clean nat cache entries */ if (freed < nr) @@ -130,7 +133,7 @@ void f2fs_join_shrinker(struct f2fs_sb_info *sbi) void f2fs_leave_shrinker(struct f2fs_sb_info *sbi) { - f2fs_shrink_extent_tree(sbi, __count_extent_cache(sbi)); + f2fs_shrink_read_extent_tree(sbi, __count_extent_cache(sbi, EX_READ)); spin_lock(&f2fs_list_lock); list_del_init(&sbi->s_list); diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index 02c1ccc4f925..81cb234cfaf6 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -52,6 +52,7 @@ TRACE_DEFINE_ENUM(CP_DISCARD); TRACE_DEFINE_ENUM(CP_TRIMMED); TRACE_DEFINE_ENUM(CP_PAUSE); TRACE_DEFINE_ENUM(CP_RESIZE); +TRACE_DEFINE_ENUM(EX_READ); #define show_block_type(type) \ __print_symbolic(type, \ @@ -1526,28 +1527,31 @@ TRACE_EVENT(f2fs_issue_flush, TRACE_EVENT(f2fs_lookup_extent_tree_start, - TP_PROTO(struct inode *inode, unsigned int pgofs), + TP_PROTO(struct inode *inode, unsigned int pgofs, enum extent_type type), - TP_ARGS(inode, pgofs), + TP_ARGS(inode, pgofs, type), TP_STRUCT__entry( __field(dev_t, dev) __field(ino_t, ino) __field(unsigned int, pgofs) + __field(enum extent_type, type) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->pgofs = pgofs; + __entry->type = type; ), - TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u", + TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, type = %s", show_dev_ino(__entry), - __entry->pgofs) + __entry->pgofs, + __entry->type == EX_READ ? 
"Read" : "N/A") ); -TRACE_EVENT_CONDITION(f2fs_lookup_extent_tree_end, +TRACE_EVENT_CONDITION(f2fs_lookup_read_extent_tree_end, TP_PROTO(struct inode *inode, unsigned int pgofs, struct extent_info *ei), @@ -1561,8 +1565,8 @@ TRACE_EVENT_CONDITION(f2fs_lookup_extent_tree_end, __field(ino_t, ino) __field(unsigned int, pgofs) __field(unsigned int, fofs) - __field(u32, blk) __field(unsigned int, len) + __field(u32, blk) ), TP_fast_assign( @@ -1570,26 +1574,26 @@ TRACE_EVENT_CONDITION(f2fs_lookup_extent_tree_end, __entry->ino = inode->i_ino; __entry->pgofs = pgofs; __entry->fofs = ei->fofs; - __entry->blk = ei->blk; __entry->len = ei->len; + __entry->blk = ei->blk; ), TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, " - "ext_info(fofs: %u, blk: %u, len: %u)", + "read_ext_info(fofs: %u, len: %u, blk: %u)", show_dev_ino(__entry), __entry->pgofs, __entry->fofs, - __entry->blk, - __entry->len) + __entry->len, + __entry->blk) ); -TRACE_EVENT(f2fs_update_extent_tree_range, +TRACE_EVENT(f2fs_update_read_extent_tree_range, - TP_PROTO(struct inode *inode, unsigned int pgofs, block_t blkaddr, - unsigned int len, + TP_PROTO(struct inode *inode, unsigned int pgofs, unsigned int len, + block_t blkaddr, unsigned int c_len), - TP_ARGS(inode, pgofs, blkaddr, len, c_len), + TP_ARGS(inode, pgofs, len, blkaddr, c_len), TP_STRUCT__entry( __field(dev_t, dev) @@ -1604,67 +1608,73 @@ TRACE_EVENT(f2fs_update_extent_tree_range, __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->pgofs = pgofs; - __entry->blk = blkaddr; __entry->len = len; + __entry->blk = blkaddr; __entry->c_len = c_len; ), TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, " - "blkaddr = %u, len = %u, " - "c_len = %u", + "len = %u, blkaddr = %u, c_len = %u", show_dev_ino(__entry), __entry->pgofs, - __entry->blk, __entry->len, + __entry->blk, __entry->c_len) ); TRACE_EVENT(f2fs_shrink_extent_tree, TP_PROTO(struct f2fs_sb_info *sbi, unsigned int node_cnt, - unsigned int tree_cnt), + unsigned int tree_cnt, enum extent_type type), - TP_ARGS(sbi, node_cnt, tree_cnt), + TP_ARGS(sbi, node_cnt, tree_cnt, type), TP_STRUCT__entry( __field(dev_t, dev) __field(unsigned int, node_cnt) __field(unsigned int, tree_cnt) + __field(enum extent_type, type) ), TP_fast_assign( __entry->dev = sbi->sb->s_dev; __entry->node_cnt = node_cnt; __entry->tree_cnt = tree_cnt; + __entry->type = type; ), - TP_printk("dev = (%d,%d), shrunk: node_cnt = %u, tree_cnt = %u", + TP_printk("dev = (%d,%d), shrunk: node_cnt = %u, tree_cnt = %u, type = %s", show_dev(__entry->dev), __entry->node_cnt, - __entry->tree_cnt) + __entry->tree_cnt, + __entry->type == EX_READ ? "Read" : "N/A") ); TRACE_EVENT(f2fs_destroy_extent_tree, - TP_PROTO(struct inode *inode, unsigned int node_cnt), + TP_PROTO(struct inode *inode, unsigned int node_cnt, + enum extent_type type), - TP_ARGS(inode, node_cnt), + TP_ARGS(inode, node_cnt, type), TP_STRUCT__entry( __field(dev_t, dev) __field(ino_t, ino) __field(unsigned int, node_cnt) + __field(enum extent_type, type) ), TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; __entry->node_cnt = node_cnt; + __entry->type = type; ), - TP_printk("dev = (%d,%d), ino = %lu, destroyed: node_cnt = %u", + TP_printk("dev = (%d,%d), ino = %lu, destroyed: node_cnt = %u, type = %s", show_dev_ino(__entry), - __entry->node_cnt) + __entry->node_cnt, + __entry->type == EX_READ ? 
"Read" : "N/A") ); DECLARE_EVENT_CLASS(f2fs_sync_dirty_inodes, From d6ba4dceab763d99617760d8f0b1cd05451d675c Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Fri, 2 Dec 2022 13:51:09 -0800 Subject: [PATCH 43/51] BACKPORT: f2fs: allocate the extent_cache by default Let's allocate it to remove the runtime complexity. Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit 693658e0c0eca25087c1ffb318d85f8d392d6d27) Change-Id: Ib46b6edd4f4a6232f3451498ee5c2f246dd37682 --- fs/f2fs/extent_cache.c | 38 +++++++++++++++++++------------------- fs/f2fs/f2fs.h | 3 ++- fs/f2fs/inode.c | 6 ++++-- fs/f2fs/namei.c | 4 ++-- 4 files changed, 27 insertions(+), 24 deletions(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index e2ef1b99511b..73616ba7ea73 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -47,20 +47,23 @@ static bool __may_read_extent_tree(struct inode *inode) return S_ISREG(inode->i_mode); } +static bool __init_may_extent_tree(struct inode *inode, enum extent_type type) +{ + if (type == EX_READ) + return __may_read_extent_tree(inode); + return false; +} + static bool __may_extent_tree(struct inode *inode, enum extent_type type) { - struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - /* * for recovered files during mount do not create extents * if shrinker is not registered. */ - if (list_empty(&sbi->s_list)) + if (list_empty(&F2FS_I_SB(inode)->s_list)) return false; - if (type == EX_READ) - return __may_read_extent_tree(inode); - return false; + return __init_may_extent_tree(inode, type); } static void __try_update_largest_extent(struct extent_tree *et, @@ -438,20 +441,18 @@ static void __drop_largest_extent(struct extent_tree *et, } } -/* return true, if inode page is changed */ -static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage, - enum extent_type type) +void f2fs_init_read_extent_tree(struct inode *inode, struct page *ipage) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct extent_tree_info *eti = &sbi->extent_tree[type]; - struct f2fs_extent *i_ext = ipage ? 
&F2FS_INODE(ipage)->i_ext : NULL; + struct extent_tree_info *eti = &sbi->extent_tree[EX_READ]; + struct f2fs_extent *i_ext = &F2FS_INODE(ipage)->i_ext; struct extent_tree *et; struct extent_node *en; struct extent_info ei; - if (!__may_extent_tree(inode, type)) { + if (!__may_extent_tree(inode, EX_READ)) { /* drop largest read extent */ - if (type == EX_READ && i_ext && i_ext->len) { + if (i_ext && i_ext->len) { f2fs_wait_on_page_writeback(ipage, NODE, true, true); i_ext->len = 0; set_page_dirty(ipage); @@ -459,13 +460,11 @@ static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage, goto out; } - et = __grab_extent_tree(inode, type); + et = __grab_extent_tree(inode, EX_READ); if (!i_ext || !i_ext->len) goto out; - BUG_ON(type != EX_READ); - get_read_extent_info(&ei, i_ext); write_lock(&et->lock); @@ -485,14 +484,15 @@ static void __f2fs_init_extent_tree(struct inode *inode, struct page *ipage, unlock_out: write_unlock(&et->lock); out: - if (type == EX_READ && !F2FS_I(inode)->extent_tree[EX_READ]) + if (!F2FS_I(inode)->extent_tree[EX_READ]) set_inode_flag(inode, FI_NO_EXTENT); } -void f2fs_init_extent_tree(struct inode *inode, struct page *ipage) +void f2fs_init_extent_tree(struct inode *inode) { /* initialize read cache */ - __f2fs_init_extent_tree(inode, ipage, EX_READ); + if (__init_may_extent_tree(inode, EX_READ)) + __grab_extent_tree(inode, EX_READ); } static bool __lookup_extent_tree(struct inode *inode, pgoff_t pgofs, diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index fff32fc07c2b..3dc413423be1 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -4067,7 +4067,7 @@ struct rb_entry *f2fs_lookup_rb_tree_ret(struct rb_root_cached *root, bool force, bool *leftmost); bool f2fs_check_rb_tree_consistence(struct f2fs_sb_info *sbi, struct rb_root_cached *root, bool check_key); -void f2fs_init_extent_tree(struct inode *inode, struct page *ipage); +void f2fs_init_extent_tree(struct inode *inode); void f2fs_drop_extent_tree(struct inode *inode); void f2fs_destroy_extent_node(struct inode *inode); void f2fs_destroy_extent_tree(struct inode *inode); @@ -4076,6 +4076,7 @@ int __init f2fs_create_extent_cache(void); void f2fs_destroy_extent_cache(void); /* read extent cache ops */ +void f2fs_init_read_extent_tree(struct inode *inode, struct page *ipage); bool f2fs_lookup_read_extent_cache(struct inode *inode, pgoff_t pgofs, struct extent_info *ei); void f2fs_update_read_extent_cache(struct dnode_of_data *dn); diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index cedad8613f89..935bcb160602 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -380,8 +380,6 @@ static int do_read_inode(struct inode *inode) fi->i_pino = le32_to_cpu(ri->i_pino); fi->i_dir_level = ri->i_dir_level; - f2fs_init_extent_tree(inode, node_page); - get_inline_info(inode, ri); fi->i_extra_isize = f2fs_has_extra_attr(inode) ? 
@@ -469,6 +467,10 @@ static int do_read_inode(struct inode *inode) F2FS_I(inode)->i_disk_time[1] = inode->i_ctime; F2FS_I(inode)->i_disk_time[2] = inode->i_mtime; F2FS_I(inode)->i_disk_time[3] = F2FS_I(inode)->i_crtime; + + /* Need all the flag bits */ + f2fs_init_read_extent_tree(inode, node_page); + f2fs_put_page(node_page, 1); stat_inc_inline_xattr(inode); diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c index c0ca487c6a16..14b2eb8a65e0 100644 --- a/fs/f2fs/namei.c +++ b/fs/f2fs/namei.c @@ -105,8 +105,6 @@ static struct inode *f2fs_new_inode(struct inode *dir, umode_t mode) } F2FS_I(inode)->i_inline_xattr_size = xattr_size; - f2fs_init_extent_tree(inode, NULL); - F2FS_I(inode)->i_flags = f2fs_mask_flags(mode, F2FS_I(dir)->i_flags & F2FS_FL_INHERITED); @@ -133,6 +131,8 @@ static struct inode *f2fs_new_inode(struct inode *dir, umode_t mode) f2fs_set_inode_flags(inode); + f2fs_init_extent_tree(inode); + trace_f2fs_new_inode(inode, 0); return inode; From aa064914fdc05cae8b7aebb76ed799a53d9bcac1 Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Thu, 1 Dec 2022 17:37:15 -0800 Subject: [PATCH 44/51] BACKPORT: f2fs: add block_age-based extent cache This patch introduces a runtime hot/cold data separation method for f2fs, in order to improve the accuracy of data temperature classification and reduce the garbage collection overhead after long-term data updates. Enhanced hot/cold data separation records the data block update frequency as the "age" of each extent per inode, and makes use of the age info to indicate a better temperature type for data block allocation: - It records the total data blocks allocated since mount; - When a file extent has been updated, it calculates the count of data blocks allocated since the last update as the age of the extent; - Before a data block is allocated, it looks up the age info and chooses a suitable segment for allocation. Test and result: - Prepare: create about 30000 files * 3% for cold files (with cold file extension like .apk, from 3M to 10M) * 50% for warm files (with random file extension like .FcDxq, from 1K to 4M) * 47% for hot files (with hot file extension like .db, from 1K to 256K) - create(5%)/random update(90%)/delete(5%) the files * total write amount is about 70G * fsync will be called for .db files, and buffered write will be used for other files The storage of the test device is large enough (128G) that it will not switch to SSR mode during the test.
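To make the age bookkeeping described above concrete, here is a small standalone sketch (illustrative userspace C, not the kernel implementation; it ignores the smoothing applied by __calculate_block_age() and the counter-overflow handling):

#include <stdio.h>

/* Default thresholds introduced by this patch (units: f2fs data blocks). */
#define DEF_HOT_DATA_AGE_THRESHOLD	262144ULL
#define DEF_WARM_DATA_AGE_THRESHOLD	2621440ULL

int main(void)
{
	unsigned long long allocated_now  = 1000000;	/* data blocks allocated since mount */
	unsigned long long at_last_update =  600000;	/* counter value when this extent was last updated */
	unsigned long long age = allocated_now - at_last_update;	/* raw age: 400000 blocks */

	if (age <= DEF_HOT_DATA_AGE_THRESHOLD)
		puts("allocate from a hot data segment");
	else if (age <= DEF_WARM_DATA_AGE_THRESHOLD)
		puts("allocate from a warm data segment");	/* this example lands here */
	else
		puts("no age hint; fall back to the default allocation policy");
	return 0;
}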
Benefit: dirty segment count increment reduce about 14% - before: Dirty +21110 - after: Dirty +18286 Bug: 264453689 Signed-off-by: qixiaoyu1 Signed-off-by: xiongping1 Signed-off-by: Jaegeuk Kim (cherry picked from commit 729055d7f1e665c57c1c90b093501cb3eb47a876) Change-Id: I8a62846bd3d44f7243300fa9653dbb623b46a96c --- Documentation/ABI/testing/sysfs-fs-f2fs | 14 ++ Documentation/filesystems/f2fs.rst | 4 + fs/f2fs/debug.c | 21 +++ fs/f2fs/extent_cache.c | 183 +++++++++++++++++++++++- fs/f2fs/f2fs.h | 38 +++++ fs/f2fs/file.c | 1 + fs/f2fs/inode.c | 1 + fs/f2fs/node.c | 10 +- fs/f2fs/node.h | 1 + fs/f2fs/segment.c | 33 +++++ fs/f2fs/shrinker.c | 10 +- fs/f2fs/super.c | 14 ++ fs/f2fs/sysfs.c | 24 ++++ include/trace/events/f2fs.h | 86 ++++++++++- 14 files changed, 430 insertions(+), 10 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 163d38c0ba64..9c7ddb4d331f 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -514,3 +514,17 @@ Date: July 2021 Contact: "Daeho Jeong" Description: You can control for which gc mode the "gc_reclaimed_segments" node shows. Refer to the description of the modes in "gc_reclaimed_segments". + +What: /sys/fs/f2fs//hot_data_age_threshold +Date: November 2022 +Contact: "Ping Xiong" +Description: When DATA SEPARATION is on, it controls the age threshold to indicate + the data blocks as hot. By default it was initialized as 262144 blocks + (equals to 1GB). + +What: /sys/fs/f2fs//warm_data_age_threshold +Date: November 2022 +Contact: "Ping Xiong" +Description: When DATA SEPARATION is on, it controls the age threshold to indicate + the data blocks as warm. By default it was initialized as 2621440 blocks + (equals to 10GB). diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst index b91e5a8444d5..f122a2b7c7d1 100644 --- a/Documentation/filesystems/f2fs.rst +++ b/Documentation/filesystems/f2fs.rst @@ -300,6 +300,10 @@ inlinecrypt When possible, encrypt/decrypt the contents of encrypted Documentation/block/inline-encryption.rst. atgc Enable age-threshold garbage collection, it provides high effectiveness and efficiency on background GC. +age_extent_cache Enable an age extent cache based on rb-tree. It records + data block update frequency of the extent per inode, in + order to provide better temperature hints for data block + allocation. 
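For reference, both defaults are counts of 4 KiB f2fs blocks: 262144 x 4 KiB = 1 GiB and 2621440 x 4 KiB = 10 GiB, which is where the 1GB and 10GB figures quoted in the descriptions above come from.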
======================== ============================================================ Debugfs Entries diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c index 75564077f2e9..f33e64acaf2b 100644 --- a/fs/f2fs/debug.c +++ b/fs/f2fs/debug.c @@ -88,6 +88,9 @@ static void update_general_status(struct f2fs_sb_info *sbi) si->hit_largest = atomic64_read(&sbi->read_hit_largest); si->hit_total[EX_READ] += si->hit_largest; + /* block age extent_cache only */ + si->allocated_data_blocks = atomic64_read(&sbi->allocated_data_blocks); + /* validation check of the segment numbers */ si->ndirty_node = get_pages(sbi, F2FS_DIRTY_NODES); si->ndirty_dent = get_pages(sbi, F2FS_DIRTY_DENTS); @@ -497,6 +500,22 @@ static int stat_show(struct seq_file *s, void *v) seq_printf(s, " - Inner Struct Count: tree: %d(%d), node: %d\n", si->ext_tree[EX_READ], si->zombie_tree[EX_READ], si->ext_node[EX_READ]); + seq_puts(s, "\nExtent Cache (Block Age):\n"); + seq_printf(s, " - Allocated Data Blocks: %llu\n", + si->allocated_data_blocks); + seq_printf(s, " - Hit Count: L1:%llu L2:%llu\n", + si->hit_cached[EX_BLOCK_AGE], + si->hit_rbtree[EX_BLOCK_AGE]); + seq_printf(s, " - Hit Ratio: %llu%% (%llu / %llu)\n", + !si->total_ext[EX_BLOCK_AGE] ? 0 : + div64_u64(si->hit_total[EX_BLOCK_AGE] * 100, + si->total_ext[EX_BLOCK_AGE]), + si->hit_total[EX_BLOCK_AGE], + si->total_ext[EX_BLOCK_AGE]); + seq_printf(s, " - Inner Struct Count: tree: %d(%d), node: %d\n", + si->ext_tree[EX_BLOCK_AGE], + si->zombie_tree[EX_BLOCK_AGE], + si->ext_node[EX_BLOCK_AGE]); seq_puts(s, "\nBalancing F2FS Async:\n"); seq_printf(s, " - DIO (R: %4d, W: %4d)\n", si->nr_dio_read, si->nr_dio_write); @@ -566,6 +585,8 @@ static int stat_show(struct seq_file *s, void *v) si->cache_mem >> 10); seq_printf(s, " - read extent cache: %llu KB\n", si->ext_mem[EX_READ] >> 10); + seq_printf(s, " - block age extent cache: %llu KB\n", + si->ext_mem[EX_BLOCK_AGE] >> 10); seq_printf(s, " - paged : %llu KB\n", si->page_mem >> 10); } diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index 73616ba7ea73..9a6db0b1b0eb 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -6,6 +6,10 @@ * Copyright (c) 2015 Samsung Electronics * Authors: Jaegeuk Kim * Chao Yu + * + * block_age-based extent cache added by: + * Copyright (c) 2022 xiaomi Co., Ltd. 
+ * http://www.xiaomi.com/ */ #include @@ -18,6 +22,7 @@ static void __set_extent_info(struct extent_info *ei, unsigned int fofs, unsigned int len, block_t blk, bool keep_clen, + unsigned long age, unsigned long last_blocks, enum extent_type type) { ei->fofs = fofs; @@ -30,6 +35,9 @@ static void __set_extent_info(struct extent_info *ei, #ifdef CONFIG_F2FS_FS_COMPRESSION ei->c_len = 0; #endif + } else if (type == EX_BLOCK_AGE) { + ei->age = age; + ei->last_blocks = last_blocks; } } @@ -47,10 +55,27 @@ static bool __may_read_extent_tree(struct inode *inode) return S_ISREG(inode->i_mode); } +static bool __may_age_extent_tree(struct inode *inode) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + + if (!test_opt(sbi, AGE_EXTENT_CACHE)) + return false; + /* don't cache block age info for cold file */ + if (is_inode_flag_set(inode, FI_COMPRESSED_FILE)) + return false; + if (file_is_cold(inode)) + return false; + + return S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode); +} + static bool __init_may_extent_tree(struct inode *inode, enum extent_type type) { if (type == EX_READ) return __may_read_extent_tree(inode); + else if (type == EX_BLOCK_AGE) + return __may_age_extent_tree(inode); return false; } @@ -90,6 +115,11 @@ static bool __is_extent_mergeable(struct extent_info *back, #endif return (back->fofs + back->len == front->fofs && back->blk + back->len == front->blk); + } else if (type == EX_BLOCK_AGE) { + return (back->fofs + back->len == front->fofs && + abs(back->age - front->age) <= SAME_AGE_REGION && + abs(back->last_blocks - front->last_blocks) <= + SAME_AGE_REGION); } return false; } @@ -488,11 +518,22 @@ out: set_inode_flag(inode, FI_NO_EXTENT); } +void f2fs_init_age_extent_tree(struct inode *inode) +{ + if (!__init_may_extent_tree(inode, EX_BLOCK_AGE)) + return; + __grab_extent_tree(inode, EX_BLOCK_AGE); +} + void f2fs_init_extent_tree(struct inode *inode) { /* initialize read cache */ if (__init_may_extent_tree(inode, EX_READ)) __grab_extent_tree(inode, EX_READ); + + /* initialize block age cache */ + if (__init_may_extent_tree(inode, EX_BLOCK_AGE)) + __grab_extent_tree(inode, EX_BLOCK_AGE); } static bool __lookup_extent_tree(struct inode *inode, pgoff_t pgofs, @@ -543,6 +584,8 @@ out: if (type == EX_READ) trace_f2fs_lookup_read_extent_tree_end(inode, pgofs, ei); + else if (type == EX_BLOCK_AGE) + trace_f2fs_lookup_age_extent_tree_end(inode, pgofs, ei); return ret; } @@ -641,6 +684,10 @@ static void __update_extent_tree_range(struct inode *inode, if (type == EX_READ) trace_f2fs_update_read_extent_tree_range(inode, fofs, len, tei->blk, 0); + else if (type == EX_BLOCK_AGE) + trace_f2fs_update_age_extent_tree_range(inode, fofs, len, + tei->age, tei->last_blocks); + write_lock(&et->lock); if (type == EX_READ) { @@ -693,6 +740,7 @@ static void __update_extent_tree_range(struct inode *inode, __set_extent_info(&ei, end, org_end - end, end - dei.fofs + dei.blk, false, + dei.age, dei.last_blocks, type); en1 = __insert_extent_tree(sbi, et, &ei, NULL, NULL, true); @@ -701,6 +749,7 @@ static void __update_extent_tree_range(struct inode *inode, __set_extent_info(&en->ei, end, en->ei.len - (end - dei.fofs), en->ei.blk + (end - dei.fofs), true, + dei.age, dei.last_blocks, type); next_en = en; } @@ -731,11 +780,15 @@ static void __update_extent_tree_range(struct inode *inode, en = next_en; } + if (type == EX_BLOCK_AGE) + goto update_age_extent_cache; + /* 3. 
update extent in read extent cache */ BUG_ON(type != EX_READ); if (tei->blk) { - __set_extent_info(&ei, fofs, len, tei->blk, false, EX_READ); + __set_extent_info(&ei, fofs, len, tei->blk, false, + 0, 0, EX_READ); if (!__try_merge_extent_node(sbi, et, &ei, prev_en, next_en)) __insert_extent_tree(sbi, et, &ei, insert_p, insert_parent, leftmost); @@ -757,7 +810,17 @@ static void __update_extent_tree_range(struct inode *inode, et->largest_updated = false; updated = true; } + goto out_read_extent_cache; +update_age_extent_cache: + if (!tei->last_blocks) + goto out_read_extent_cache; + __set_extent_info(&ei, fofs, len, 0, false, + tei->age, tei->last_blocks, EX_BLOCK_AGE); + if (!__try_merge_extent_node(sbi, et, &ei, prev_en, next_en)) + __insert_extent_tree(sbi, et, &ei, + insert_p, insert_parent, leftmost); +out_read_extent_cache: write_unlock(&et->lock); if (updated) @@ -795,7 +858,7 @@ void f2fs_update_read_extent_tree_range_compressed(struct inode *inode, if (en) goto unlock_out; - __set_extent_info(&ei, fofs, llen, blkaddr, true, EX_READ); + __set_extent_info(&ei, fofs, llen, blkaddr, true, 0, 0, EX_READ); ei.c_len = c_len; if (!__try_merge_extent_node(sbi, et, &ei, prev_en, next_en)) @@ -806,6 +869,72 @@ unlock_out: } #endif +static unsigned long long __calculate_block_age(unsigned long long new, + unsigned long long old) +{ + unsigned long long diff; + + diff = (new >= old) ? new - (new - old) : new + (old - new); + + return div_u64(diff * LAST_AGE_WEIGHT, 100); +} + +/* This returns a new age and allocated blocks in ei */ +static int __get_new_block_age(struct inode *inode, struct extent_info *ei) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + loff_t f_size = i_size_read(inode); + unsigned long long cur_blocks = + atomic64_read(&sbi->allocated_data_blocks); + + /* + * When I/O is not aligned to a PAGE_SIZE, update will happen to the last + * file block even in seq write. So don't record age for newly last file + * block here. 
+ */ + if ((f_size >> PAGE_SHIFT) == ei->fofs && f_size & (PAGE_SIZE - 1) && + ei->blk == NEW_ADDR) + return -EINVAL; + + if (__lookup_extent_tree(inode, ei->fofs, ei, EX_BLOCK_AGE)) { + unsigned long long cur_age; + + if (cur_blocks >= ei->last_blocks) + cur_age = cur_blocks - ei->last_blocks; + else + /* allocated_data_blocks overflow */ + cur_age = ULLONG_MAX - ei->last_blocks + cur_blocks; + + if (ei->age) + ei->age = __calculate_block_age(cur_age, ei->age); + else + ei->age = cur_age; + ei->last_blocks = cur_blocks; + WARN_ON(ei->age > cur_blocks); + return 0; + } + + f2fs_bug_on(sbi, ei->blk == NULL_ADDR); + + /* the data block was allocated for the first time */ + if (ei->blk == NEW_ADDR) + goto out; + + if (__is_valid_data_blkaddr(ei->blk) && + !f2fs_is_valid_blkaddr(sbi, ei->blk, DATA_GENERIC_ENHANCE)) { + f2fs_bug_on(sbi, 1); + return -EINVAL; + } +out: + /* + * init block age with zero, this can happen when the block age extent + * was reclaimed due to memory constraint or system reboot + */ + ei->age = 0; + ei->last_blocks = cur_blocks; + return 0; +} + static void __update_extent_cache(struct dnode_of_data *dn, enum extent_type type) { struct extent_info ei; @@ -822,6 +951,10 @@ static void __update_extent_cache(struct dnode_of_data *dn, enum extent_type typ ei.blk = NULL_ADDR; else ei.blk = dn->data_blkaddr; + } else if (type == EX_BLOCK_AGE) { + ei.blk = dn->data_blkaddr; + if (__get_new_block_age(dn->inode, &ei)) + return; } __update_extent_tree_range(dn->inode, &ei, type); } @@ -939,6 +1072,43 @@ unsigned int f2fs_shrink_read_extent_tree(struct f2fs_sb_info *sbi, int nr_shrin return __shrink_extent_tree(sbi, nr_shrink, EX_READ); } +/* block age extent cache operations */ +bool f2fs_lookup_age_extent_cache(struct inode *inode, pgoff_t pgofs, + struct extent_info *ei) +{ + if (!__may_extent_tree(inode, EX_BLOCK_AGE)) + return false; + + return __lookup_extent_tree(inode, pgofs, ei, EX_BLOCK_AGE); +} + +void f2fs_update_age_extent_cache(struct dnode_of_data *dn) +{ + return __update_extent_cache(dn, EX_BLOCK_AGE); +} + +void f2fs_update_age_extent_cache_range(struct dnode_of_data *dn, + pgoff_t fofs, unsigned int len) +{ + struct extent_info ei = { + .fofs = fofs, + .len = len, + }; + + if (!__may_extent_tree(dn->inode, EX_BLOCK_AGE)) + return; + + __update_extent_tree_range(dn->inode, &ei, EX_BLOCK_AGE); +} + +unsigned int f2fs_shrink_age_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink) +{ + if (!test_opt(sbi, AGE_EXTENT_CACHE)) + return 0; + + return __shrink_extent_tree(sbi, nr_shrink, EX_BLOCK_AGE); +} + static unsigned int __destroy_extent_node(struct inode *inode, enum extent_type type) { @@ -959,6 +1129,7 @@ static unsigned int __destroy_extent_node(struct inode *inode, void f2fs_destroy_extent_node(struct inode *inode) { __destroy_extent_node(inode, EX_READ); + __destroy_extent_node(inode, EX_BLOCK_AGE); } static void __drop_extent_tree(struct inode *inode, enum extent_type type) @@ -987,6 +1158,7 @@ static void __drop_extent_tree(struct inode *inode, enum extent_type type) void f2fs_drop_extent_tree(struct inode *inode) { __drop_extent_tree(inode, EX_READ); + __drop_extent_tree(inode, EX_BLOCK_AGE); } static void __destroy_extent_tree(struct inode *inode, enum extent_type type) @@ -1027,6 +1199,7 @@ static void __destroy_extent_tree(struct inode *inode, enum extent_type type) void f2fs_destroy_extent_tree(struct inode *inode) { __destroy_extent_tree(inode, EX_READ); + __destroy_extent_tree(inode, EX_BLOCK_AGE); } static void __init_extent_tree_info(struct 
extent_tree_info *eti) @@ -1044,6 +1217,12 @@ static void __init_extent_tree_info(struct extent_tree_info *eti) void f2fs_init_extent_cache_info(struct f2fs_sb_info *sbi) { __init_extent_tree_info(&sbi->extent_tree[EX_READ]); + __init_extent_tree_info(&sbi->extent_tree[EX_BLOCK_AGE]); + + /* initialize for block age extents */ + atomic64_set(&sbi->allocated_data_blocks, 0); + sbi->hot_data_age_threshold = DEF_HOT_DATA_AGE_THRESHOLD; + sbi->warm_data_age_threshold = DEF_WARM_DATA_AGE_THRESHOLD; } int __init f2fs_create_extent_cache(void) diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 3dc413423be1..c2075b87ec23 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -99,6 +99,7 @@ extern const char *f2fs_fault_name[FAULT_MAX]; #define F2FS_MOUNT_MERGE_CHECKPOINT 0x10000000 #define F2FS_MOUNT_GC_MERGE 0x20000000 #define F2FS_MOUNT_COMPRESS_CACHE 0x40000000 +#define F2FS_MOUNT_AGE_EXTENT_CACHE 0x80000000 #define F2FS_OPTION(sbi) ((sbi)->mount_opt) #define clear_opt(sbi, option) (F2FS_OPTION(sbi).opt &= ~F2FS_MOUNT_##option) @@ -570,9 +571,22 @@ enum { /* number of extent info in extent cache we try to shrink */ #define READ_EXTENT_CACHE_SHRINK_NUMBER 128 +/* number of age extent info in extent cache we try to shrink */ +#define AGE_EXTENT_CACHE_SHRINK_NUMBER 128 +#define LAST_AGE_WEIGHT 30 +#define SAME_AGE_REGION 1024 + +/* + * Define data block with age less than 1GB as hot data + * define data block with age less than 10GB but more than 1GB as warm data + */ +#define DEF_HOT_DATA_AGE_THRESHOLD 262144 +#define DEF_WARM_DATA_AGE_THRESHOLD 2621440 + /* extent cache type */ enum extent_type { EX_READ, + EX_BLOCK_AGE, NR_EXTENT_CACHES, }; @@ -600,6 +614,13 @@ struct extent_info { unsigned int c_len; #endif }; + /* block age extent_cache */ + struct { + /* block age of the extent */ + unsigned long long age; + /* last total blocks allocated */ + unsigned long long last_blocks; + }; }; }; @@ -1589,6 +1610,11 @@ struct f2fs_sb_info { /* for extent tree cache */ struct extent_tree_info extent_tree[NR_EXTENT_CACHES]; + atomic64_t allocated_data_blocks; /* for block age extent_cache */ + + /* The threshold used for hot and warm data seperation*/ + unsigned int hot_data_age_threshold; + unsigned int warm_data_age_threshold; /* basic filesystem units */ unsigned int log_sectors_per_block; /* log2 sectors per block */ @@ -3770,6 +3796,8 @@ struct f2fs_stat_info { unsigned long long ext_mem[NR_EXTENT_CACHES]; /* for read extent cache */ unsigned long long hit_largest; + /* for block age extent cache */ + unsigned long long allocated_data_blocks; int ndirty_node, ndirty_dent, ndirty_meta, ndirty_imeta; int ndirty_data, ndirty_qdata; int inmem_pages; @@ -4085,6 +4113,16 @@ void f2fs_update_read_extent_cache_range(struct dnode_of_data *dn, unsigned int f2fs_shrink_read_extent_tree(struct f2fs_sb_info *sbi, int nr_shrink); +/* block age extent cache ops */ +void f2fs_init_age_extent_tree(struct inode *inode); +bool f2fs_lookup_age_extent_cache(struct inode *inode, pgoff_t pgofs, + struct extent_info *ei); +void f2fs_update_age_extent_cache(struct dnode_of_data *dn); +void f2fs_update_age_extent_cache_range(struct dnode_of_data *dn, + pgoff_t fofs, unsigned int len); +unsigned int f2fs_shrink_age_extent_tree(struct f2fs_sb_info *sbi, + int nr_shrink); + /* * sysfs.c */ diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 18da58068820..d6ded768ac09 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -608,6 +608,7 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count) fofs = 
f2fs_start_bidx_of_node(ofs_of_node(dn->node_page), dn->inode) + ofs; f2fs_update_read_extent_cache_range(dn, fofs, 0, len); + f2fs_update_age_extent_cache_range(dn, fofs, nr_free); dec_valid_block_count(sbi, dn->inode, nr_free); } dn->ofs_in_node = ofs; diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c index 935bcb160602..5bf4f1cccd71 100644 --- a/fs/f2fs/inode.c +++ b/fs/f2fs/inode.c @@ -470,6 +470,7 @@ static int do_read_inode(struct inode *inode) /* Need all the flag bits */ f2fs_init_read_extent_tree(inode, node_page); + f2fs_init_age_extent_tree(inode); f2fs_put_page(node_page, 1); diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 4e5cbd856df7..230c63c99b55 100644 --- a/fs/f2fs/node.c +++ b/fs/f2fs/node.c @@ -58,7 +58,7 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type) avail_ram = val.totalram - val.totalhigh; /* - * give 25%, 25%, 50%, 50%, 50% memory for each components respectively + * give 25%, 25%, 50%, 50%, 25%, 25% memory for each components respectively */ if (type == FREE_NIDS) { mem_size = (nm_i->nid_cnt[FREE_NID] * @@ -83,14 +83,16 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type) sizeof(struct ino_entry); mem_size >>= PAGE_SHIFT; res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1); - } else if (type == READ_EXTENT_CACHE) { - struct extent_tree_info *eti = &sbi->extent_tree[EX_READ]; + } else if (type == READ_EXTENT_CACHE || type == AGE_EXTENT_CACHE) { + enum extent_type etype = type == READ_EXTENT_CACHE ? + EX_READ : EX_BLOCK_AGE; + struct extent_tree_info *eti = &sbi->extent_tree[etype]; mem_size = (atomic_read(&eti->total_ext_tree) * sizeof(struct extent_tree) + atomic_read(&eti->total_ext_node) * sizeof(struct extent_node)) >> PAGE_SHIFT; - res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1); + res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 2); } else if (type == INMEM_PAGES) { /* it allows 20% / total_ram for inmemory pages */ mem_size = get_pages(sbi, F2FS_INMEM_PAGES); diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h index 2c152e677a06..3de7891c8e79 100644 --- a/fs/f2fs/node.h +++ b/fs/f2fs/node.h @@ -149,6 +149,7 @@ enum mem_type { DIRTY_DENTS, /* indicates dirty dentry pages */ INO_ENTRIES, /* indicates inode entries */ READ_EXTENT_CACHE, /* indicates read extent cache */ + AGE_EXTENT_CACHE, /* indicates age extent cache */ INMEM_PAGES, /* indicates inmemory pages */ DISCARD_CACHE, /* indicates memory of cached discard cmds */ COMPRESS_PAGE, /* indicates memory of cached compressed pages */ diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 512781ad3e61..63245ab3a27c 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -540,6 +540,11 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi, bool from_bg) f2fs_shrink_read_extent_tree(sbi, READ_EXTENT_CACHE_SHRINK_NUMBER); + /* try to shrink age extent cache when there is no enough memory */ + if (!f2fs_available_free_memory(sbi, AGE_EXTENT_CACHE)) + f2fs_shrink_age_extent_tree(sbi, + AGE_EXTENT_CACHE_SHRINK_NUMBER); + /* check the # of cached NAT entries */ if (!f2fs_available_free_memory(sbi, NAT_ENTRIES)) f2fs_try_to_free_nats(sbi, NAT_ENTRY_PER_BLOCK); @@ -3293,10 +3298,28 @@ static int __get_segment_type_4(struct f2fs_io_info *fio) } } +static int __get_age_segment_type(struct inode *inode, pgoff_t pgofs) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + struct extent_info ei; + + if (f2fs_lookup_age_extent_cache(inode, pgofs, &ei)) { + if (!ei.age) + return NO_CHECK_TYPE; + if (ei.age <= sbi->hot_data_age_threshold) + return 
CURSEG_HOT_DATA; + if (ei.age <= sbi->warm_data_age_threshold) + return CURSEG_WARM_DATA; + return CURSEG_COLD_DATA; + } + return NO_CHECK_TYPE; +} + static int __get_segment_type_6(struct f2fs_io_info *fio) { if (fio->type == DATA) { struct inode *inode = fio->page->mapping->host; + int type; if (is_inode_flag_set(inode, FI_ALIGNED_WRITE)) return CURSEG_COLD_DATA_PINNED; @@ -3311,6 +3334,11 @@ static int __get_segment_type_6(struct f2fs_io_info *fio) } if (file_is_cold(inode) || f2fs_need_compress_data(inode)) return CURSEG_COLD_DATA; + + type = __get_age_segment_type(inode, fio->page->index); + if (type != NO_CHECK_TYPE) + return type; + if (file_is_hot(inode) || is_inode_flag_set(inode, FI_HOT_DATA) || f2fs_is_atomic_file(inode) || @@ -3422,6 +3450,9 @@ void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page, locate_dirty_segment(sbi, GET_SEGNO(sbi, old_blkaddr)); locate_dirty_segment(sbi, GET_SEGNO(sbi, *new_blkaddr)); + if (IS_DATASEG(type)) + atomic64_inc(&sbi->allocated_data_blocks); + up_write(&sit_i->sentry_lock); if (page && IS_NODESEG(type)) { @@ -3543,6 +3574,8 @@ void f2fs_outplace_write_data(struct dnode_of_data *dn, struct f2fs_summary sum; f2fs_bug_on(sbi, dn->data_blkaddr == NULL_ADDR); + if (fio->io_type == FS_DATA_IO || fio->io_type == FS_CP_DATA_IO) + f2fs_update_age_extent_cache(dn); set_summary(&sum, dn->nid, dn->ofs_in_node, fio->version); do_write_page(&sum, fio); f2fs_update_data_blkaddr(dn, fio->new_blkaddr); diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c index 33c490e69ae3..83d6fb97dcae 100644 --- a/fs/f2fs/shrinker.c +++ b/fs/f2fs/shrinker.c @@ -59,6 +59,9 @@ unsigned long f2fs_shrink_count(struct shrinker *shrink, /* count read extent cache entries */ count += __count_extent_cache(sbi, EX_READ); + /* count block age extent cache entries */ + count += __count_extent_cache(sbi, EX_BLOCK_AGE); + /* count clean nat cache entries */ count += __count_nat_entries(sbi); @@ -102,8 +105,11 @@ unsigned long f2fs_shrink_scan(struct shrinker *shrink, sbi->shrinker_run_no = run_no; + /* shrink extent cache entries */ + freed += f2fs_shrink_age_extent_tree(sbi, nr >> 2); + /* shrink read extent cache entries */ - freed += f2fs_shrink_read_extent_tree(sbi, nr >> 1); + freed += f2fs_shrink_read_extent_tree(sbi, nr >> 2); /* shrink clean nat cache entries */ if (freed < nr) @@ -134,6 +140,8 @@ void f2fs_join_shrinker(struct f2fs_sb_info *sbi) void f2fs_leave_shrinker(struct f2fs_sb_info *sbi) { f2fs_shrink_read_extent_tree(sbi, __count_extent_cache(sbi, EX_READ)); + f2fs_shrink_age_extent_tree(sbi, + __count_extent_cache(sbi, EX_BLOCK_AGE)); spin_lock(&f2fs_list_lock); list_del_init(&sbi->s_list); diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index 98da6a8636e0..9650528d8f65 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -154,6 +154,7 @@ enum { Opt_atgc, Opt_gc_merge, Opt_nogc_merge, + Opt_age_extent_cache, Opt_err, }; @@ -229,6 +230,7 @@ static match_table_t f2fs_tokens = { {Opt_atgc, "atgc"}, {Opt_gc_merge, "gc_merge"}, {Opt_nogc_merge, "nogc_merge"}, + {Opt_age_extent_cache, "age_extent_cache"}, {Opt_err, NULL}, }; @@ -1148,6 +1150,9 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount) case Opt_nogc_merge: clear_opt(sbi, GC_MERGE); break; + case Opt_age_extent_cache: + set_opt(sbi, AGE_EXTENT_CACHE); + break; default: f2fs_err(sbi, "Unrecognized mount option \"%s\" or missing value", p); @@ -1821,6 +1826,8 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root) seq_puts(seq, ",extent_cache"); 
else seq_puts(seq, ",noextent_cache"); + if (test_opt(sbi, AGE_EXTENT_CACHE)) + seq_puts(seq, ",age_extent_cache"); if (test_opt(sbi, DATA_FLUSH)) seq_puts(seq, ",data_flush"); @@ -2043,6 +2050,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data) bool need_restart_ckpt = false, need_stop_ckpt = false; bool need_restart_flush = false, need_stop_flush = false; bool no_read_extent_cache = !test_opt(sbi, READ_EXTENT_CACHE); + bool no_age_extent_cache = !test_opt(sbi, AGE_EXTENT_CACHE); bool disable_checkpoint = test_opt(sbi, DISABLE_CHECKPOINT); bool no_io_align = !F2FS_IO_ALIGNED(sbi); bool no_atgc = !test_opt(sbi, ATGC); @@ -2137,6 +2145,12 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data) f2fs_warn(sbi, "switch extent_cache option is not allowed"); goto restore_opts; } + /* disallow enable/disable age extent_cache dynamically */ + if (no_age_extent_cache == !!test_opt(sbi, AGE_EXTENT_CACHE)) { + err = -EINVAL; + f2fs_warn(sbi, "switch age_extent_cache option is not allowed"); + goto restore_opts; + } if (no_io_align == !!F2FS_IO_ALIGNED(sbi)) { err = -EINVAL; diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c index 23eba7514c9c..c9217d7a941f 100644 --- a/fs/f2fs/sysfs.c +++ b/fs/f2fs/sysfs.c @@ -549,6 +549,24 @@ out: return count; } + if (!strcmp(a->attr.name, "hot_data_age_threshold")) { + if (t == 0 || t >= sbi->warm_data_age_threshold) + return -EINVAL; + if (t == *ui) + return count; + *ui = (unsigned int)t; + return count; + } + + if (!strcmp(a->attr.name, "warm_data_age_threshold")) { + if (t == 0 || t <= sbi->hot_data_age_threshold) + return -EINVAL; + if (t == *ui) + return count; + *ui = (unsigned int)t; + return count; + } + *ui = (unsigned int)t; return count; @@ -778,6 +796,10 @@ F2FS_RW_ATTR(ATGC_INFO, atgc_management, atgc_age_threshold, age_threshold); F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_segment_mode, gc_segment_mode); F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_reclaimed_segments, gc_reclaimed_segs); +/* For block age extent cache */ +F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, hot_data_age_threshold, hot_data_age_threshold); +F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, warm_data_age_threshold, warm_data_age_threshold); + #define ATTR_LIST(name) (&f2fs_attr_##name.attr) static struct attribute *f2fs_attrs[] = { ATTR_LIST(gc_urgent_sleep_time), @@ -853,6 +875,8 @@ static struct attribute *f2fs_attrs[] = { ATTR_LIST(atgc_age_threshold), ATTR_LIST(gc_segment_mode), ATTR_LIST(gc_reclaimed_segments), + ATTR_LIST(hot_data_age_threshold), + ATTR_LIST(warm_data_age_threshold), NULL, }; ATTRIBUTE_GROUPS(f2fs); diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h index 81cb234cfaf6..e927889c7fd5 100644 --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -53,6 +53,7 @@ TRACE_DEFINE_ENUM(CP_TRIMMED); TRACE_DEFINE_ENUM(CP_PAUSE); TRACE_DEFINE_ENUM(CP_RESIZE); TRACE_DEFINE_ENUM(EX_READ); +TRACE_DEFINE_ENUM(EX_BLOCK_AGE); #define show_block_type(type) \ __print_symbolic(type, \ @@ -163,6 +164,11 @@ TRACE_DEFINE_ENUM(EX_READ); { COMPRESS_ZSTD, "ZSTD" }, \ { COMPRESS_LZORLE, "LZO-RLE" }) +#define show_extent_type(type) \ + __print_symbolic(type, \ + { EX_READ, "Read" }, \ + { EX_BLOCK_AGE, "Block Age" }) + struct f2fs_sb_info; struct f2fs_io_info; struct extent_info; @@ -1548,7 +1554,7 @@ TRACE_EVENT(f2fs_lookup_extent_tree_start, TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, type = %s", show_dev_ino(__entry), __entry->pgofs, - __entry->type == EX_READ ? 
"Read" : "N/A") + show_extent_type(__entry->type)) ); TRACE_EVENT_CONDITION(f2fs_lookup_read_extent_tree_end, @@ -1587,6 +1593,45 @@ TRACE_EVENT_CONDITION(f2fs_lookup_read_extent_tree_end, __entry->blk) ); +TRACE_EVENT_CONDITION(f2fs_lookup_age_extent_tree_end, + + TP_PROTO(struct inode *inode, unsigned int pgofs, + struct extent_info *ei), + + TP_ARGS(inode, pgofs, ei), + + TP_CONDITION(ei), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __field(unsigned int, pgofs) + __field(unsigned int, fofs) + __field(unsigned int, len) + __field(unsigned long long, age) + __field(unsigned long long, blocks) + ), + + TP_fast_assign( + __entry->dev = inode->i_sb->s_dev; + __entry->ino = inode->i_ino; + __entry->pgofs = pgofs; + __entry->fofs = ei->fofs; + __entry->len = ei->len; + __entry->age = ei->age; + __entry->blocks = ei->last_blocks; + ), + + TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, " + "age_ext_info(fofs: %u, len: %u, age: %llu, blocks: %llu)", + show_dev_ino(__entry), + __entry->pgofs, + __entry->fofs, + __entry->len, + __entry->age, + __entry->blocks) +); + TRACE_EVENT(f2fs_update_read_extent_tree_range, TP_PROTO(struct inode *inode, unsigned int pgofs, unsigned int len, @@ -1622,6 +1667,41 @@ TRACE_EVENT(f2fs_update_read_extent_tree_range, __entry->c_len) ); +TRACE_EVENT(f2fs_update_age_extent_tree_range, + + TP_PROTO(struct inode *inode, unsigned int pgofs, unsigned int len, + unsigned long long age, + unsigned long long last_blks), + + TP_ARGS(inode, pgofs, len, age, last_blks), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, ino) + __field(unsigned int, pgofs) + __field(unsigned int, len) + __field(unsigned long long, age) + __field(unsigned long long, blocks) + ), + + TP_fast_assign( + __entry->dev = inode->i_sb->s_dev; + __entry->ino = inode->i_ino; + __entry->pgofs = pgofs; + __entry->len = len; + __entry->age = age; + __entry->blocks = last_blks; + ), + + TP_printk("dev = (%d,%d), ino = %lu, pgofs = %u, " + "len = %u, age = %llu, blocks = %llu", + show_dev_ino(__entry), + __entry->pgofs, + __entry->len, + __entry->age, + __entry->blocks) +); + TRACE_EVENT(f2fs_shrink_extent_tree, TP_PROTO(struct f2fs_sb_info *sbi, unsigned int node_cnt, @@ -1647,7 +1727,7 @@ TRACE_EVENT(f2fs_shrink_extent_tree, show_dev(__entry->dev), __entry->node_cnt, __entry->tree_cnt, - __entry->type == EX_READ ? "Read" : "N/A") + show_extent_type(__entry->type)) ); TRACE_EVENT(f2fs_destroy_extent_tree, @@ -1674,7 +1754,7 @@ TRACE_EVENT(f2fs_destroy_extent_tree, TP_printk("dev = (%d,%d), ino = %lu, destroyed: node_cnt = %u, type = %s", show_dev_ino(__entry), __entry->node_cnt, - __entry->type == EX_READ ? "Read" : "N/A") + show_extent_type(__entry->type)) ); DECLARE_EVENT_CLASS(f2fs_sync_dirty_inodes, From 073b997b02e7df2eec191d1a5278e20e73cfe126 Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Fri, 16 Dec 2022 14:05:44 -0800 Subject: [PATCH 45/51] BACKPORT: f2fs: initialize extent_cache parameter This can avoid confusing tracepoint values. 
Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit b5825de803e7cf56262f5b3a8bc692d61acfe653) Change-Id: Ia43e2c1fd405a11fc4122b68f05f40c10f47f263 --- fs/f2fs/data.c | 2 +- fs/f2fs/extent_cache.c | 2 +- fs/f2fs/file.c | 2 +- fs/f2fs/segment.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 991b4f9593b1..cce1a178106f 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -2156,7 +2156,7 @@ int f2fs_read_multi_pages(struct compress_ctx *cc, struct bio **bio_ret, sector_t last_block_in_file; const unsigned blocksize = blks_to_bytes(inode, 1); struct decompress_io_ctx *dic = NULL; - struct extent_info ei = {0, }; + struct extent_info ei = {}; bool from_dnode = true; int i; int ret = 0; diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index 9a6db0b1b0eb..2d45da3d8dcc 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -937,7 +937,7 @@ out: static void __update_extent_cache(struct dnode_of_data *dn, enum extent_type type) { - struct extent_info ei; + struct extent_info ei = {}; if (!__may_extent_tree(dn->inode, type)) return; diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index d6ded768ac09..ef49b3684c0c 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -2591,7 +2591,7 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi, struct f2fs_map_blocks map = { .m_next_extent = NULL, .m_seg_type = NO_CHECK_TYPE, .m_may_create = false }; - struct extent_info ei = {0, }; + struct extent_info ei = {}; pgoff_t pg_start, pg_end, next_pgofs; unsigned int blk_per_seg = sbi->blocks_per_seg; unsigned int total = 0, sec_num; diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 63245ab3a27c..744aa9274b37 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -3301,7 +3301,7 @@ static int __get_segment_type_4(struct f2fs_io_info *fio) static int __get_age_segment_type(struct inode *inode, pgoff_t pgofs) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); - struct extent_info ei; + struct extent_info ei = {}; if (f2fs_lookup_age_extent_cache(inode, pgofs, &ei)) { if (!ei.age) From 937ed4edda40580f289543bca376ead46a5bd7a2 Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Fri, 16 Dec 2022 14:41:54 -0800 Subject: [PATCH 46/51] BACKPORT: f2fs: don't mix to use union values in extent_info Let's explicitly use the defined values in block_age case only. Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit 354a326851a615c7fd8f5a6bda72afc9f051c264) Change-Id: I4011cbf10117a9023ef6fc507726020159ead72d --- fs/f2fs/extent_cache.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index 2d45da3d8dcc..3c77b9a982e4 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -880,7 +880,8 @@ static unsigned long long __calculate_block_age(unsigned long long new, } /* This returns a new age and allocated blocks in ei */ -static int __get_new_block_age(struct inode *inode, struct extent_info *ei) +static int __get_new_block_age(struct inode *inode, struct extent_info *ei, + block_t blkaddr) { struct f2fs_sb_info *sbi = F2FS_I_SB(inode); loff_t f_size = i_size_read(inode); @@ -893,7 +894,7 @@ static int __get_new_block_age(struct inode *inode, struct extent_info *ei) * block here. 
*/ if ((f_size >> PAGE_SHIFT) == ei->fofs && f_size & (PAGE_SIZE - 1) && - ei->blk == NEW_ADDR) + blkaddr == NEW_ADDR) return -EINVAL; if (__lookup_extent_tree(inode, ei->fofs, ei, EX_BLOCK_AGE)) { @@ -914,14 +915,14 @@ static int __get_new_block_age(struct inode *inode, struct extent_info *ei) return 0; } - f2fs_bug_on(sbi, ei->blk == NULL_ADDR); + f2fs_bug_on(sbi, blkaddr == NULL_ADDR); /* the data block was allocated for the first time */ - if (ei->blk == NEW_ADDR) + if (blkaddr == NEW_ADDR) goto out; - if (__is_valid_data_blkaddr(ei->blk) && - !f2fs_is_valid_blkaddr(sbi, ei->blk, DATA_GENERIC_ENHANCE)) { + if (__is_valid_data_blkaddr(blkaddr) && + !f2fs_is_valid_blkaddr(sbi, blkaddr, DATA_GENERIC_ENHANCE)) { f2fs_bug_on(sbi, 1); return -EINVAL; } @@ -952,8 +953,7 @@ static void __update_extent_cache(struct dnode_of_data *dn, enum extent_type typ else ei.blk = dn->data_blkaddr; } else if (type == EX_BLOCK_AGE) { - ei.blk = dn->data_blkaddr; - if (__get_new_block_age(dn->inode, &ei)) + if (__get_new_block_age(dn->inode, &ei, dn->data_blkaddr)) return; } __update_extent_tree_range(dn->inode, &ei, type); From 39b8fee3c007ea11f13930ce22b4ff7b4f5afe0e Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Fri, 16 Dec 2022 16:36:36 -0800 Subject: [PATCH 47/51] BACKPORT: f2fs: should use a temp extent_info for lookup Otherwise, __lookup_extent_tree() will override the given extent_info which will be used by caller. Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit db640d99b1ed4745eaf3af9ea1910996cddaf30c) Change-Id: Ib37a9ec57c24cfe303ee23a5e90618e6e0dabe61 --- fs/f2fs/extent_cache.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index 3c77b9a982e4..e9665c0cc386 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -887,6 +887,7 @@ static int __get_new_block_age(struct inode *inode, struct extent_info *ei, loff_t f_size = i_size_read(inode); unsigned long long cur_blocks = atomic64_read(&sbi->allocated_data_blocks); + struct extent_info tei = *ei; /* only fofs and len are valid */ /* * When I/O is not aligned to a PAGE_SIZE, update will happen to the last @@ -897,17 +898,17 @@ static int __get_new_block_age(struct inode *inode, struct extent_info *ei, blkaddr == NEW_ADDR) return -EINVAL; - if (__lookup_extent_tree(inode, ei->fofs, ei, EX_BLOCK_AGE)) { + if (__lookup_extent_tree(inode, ei->fofs, &tei, EX_BLOCK_AGE)) { unsigned long long cur_age; - if (cur_blocks >= ei->last_blocks) - cur_age = cur_blocks - ei->last_blocks; + if (cur_blocks >= tei.last_blocks) + cur_age = cur_blocks - tei.last_blocks; else /* allocated_data_blocks overflow */ - cur_age = ULLONG_MAX - ei->last_blocks + cur_blocks; + cur_age = ULLONG_MAX - tei.last_blocks + cur_blocks; - if (ei->age) - ei->age = __calculate_block_age(cur_age, ei->age); + if (tei.age) + ei->age = __calculate_block_age(cur_age, tei.age); else ei->age = cur_age; ei->last_blocks = cur_blocks; From 6e50bbff175ea5de46d23673fd146af3f7dfcec4 Mon Sep 17 00:00:00 2001 From: Jaegeuk Kim Date: Wed, 21 Dec 2022 16:14:10 -0800 Subject: [PATCH 48/51] BACKPORT: f2fs: let's avoid panic if extent_tree is not created This patch avoids the below panic. 
pc : __lookup_extent_tree+0xd8/0x760 lr : f2fs_do_write_data_page+0x104/0x87c sp : ffffffc010cbb3c0 x29: ffffffc010cbb3e0 x28: 0000000000000000 x27: ffffff8803e7f020 x26: ffffff8803e7ed40 x25: ffffff8803e7f020 x24: ffffffc010cbb460 x23: ffffffc010cbb480 x22: 0000000000000000 x21: 0000000000000000 x20: ffffffff22e90900 x19: 0000000000000000 x18: ffffffc010c5d080 x17: 0000000000000000 x16: 0000000000000020 x15: ffffffdb1acdbb88 x14: ffffff888759e2b0 x13: 0000000000000000 x12: ffffff802da49000 x11: 000000000a001200 x10: ffffff8803e7ed40 x9 : ffffff8023195800 x8 : ffffff802da49078 x7 : 0000000000000001 x6 : 0000000000000000 x5 : 0000000000000006 x4 : ffffffc010cbba28 x3 : 0000000000000000 x2 : ffffffc010cbb480 x1 : 0000000000000000 x0 : ffffff8803e7ed40 Call trace: __lookup_extent_tree+0xd8/0x760 f2fs_do_write_data_page+0x104/0x87c f2fs_write_single_data_page+0x420/0xb60 f2fs_write_cache_pages+0x418/0xb1c __f2fs_write_data_pages+0x428/0x58c f2fs_write_data_pages+0x30/0x40 do_writepages+0x88/0x190 __writeback_single_inode+0x48/0x448 writeback_sb_inodes+0x468/0x9e8 __writeback_inodes_wb+0xb8/0x2a4 wb_writeback+0x33c/0x740 wb_do_writeback+0x2b4/0x400 wb_workfn+0xe4/0x34c process_one_work+0x24c/0x5bc worker_thread+0x3e8/0xa50 kthread+0x150/0x1b4 Bug: 264453689 Signed-off-by: Jaegeuk Kim (cherry picked from commit 24af2f08d60039427995f78150963743dcb080de) Change-Id: I7594e80fb7df0dff3f494e79be763a9870c8f063 --- fs/f2fs/extent_cache.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/extent_cache.c b/fs/f2fs/extent_cache.c index e9665c0cc386..567d07f6effe 100644 --- a/fs/f2fs/extent_cache.c +++ b/fs/f2fs/extent_cache.c @@ -545,7 +545,8 @@ static bool __lookup_extent_tree(struct inode *inode, pgoff_t pgofs, struct extent_node *en; bool ret = false; - f2fs_bug_on(sbi, !et); + if (!et) + return false; trace_f2fs_lookup_extent_tree_start(inode, pgofs, type); From ab89185ddb154ad928d5c29f24388ca6825b3098 Mon Sep 17 00:00:00 2001 From: Kever Yang Date: Mon, 9 Jan 2023 10:19:33 +0800 Subject: [PATCH 49/51] ANDROID: GKI: rockchip: Update symbols Leaf changes summary: 0 artifacts changed Changed leaf types summary: 0 leaf type changed Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 2 Added functions Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable 2 Added functions: [A] 'function void drm_hdcp_update_content_protection(drm_connector*, u64)' [A] 'function void sdhci_reset_tuning(sdhci_host*)' Bug: 239396464 Signed-off-by: Kever Yang Change-Id: I9d15fb674c1bd308e88ad34352092deef60eafcc --- android/abi_gki_aarch64.xml | 11 +++++++++++ android/abi_gki_aarch64_rockchip | 14 +++++++++++++- 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/android/abi_gki_aarch64.xml b/android/abi_gki_aarch64.xml index bda4150b97dd..b511374aa82e 100644 --- a/android/abi_gki_aarch64.xml +++ b/android/abi_gki_aarch64.xml @@ -2261,6 +2261,7 @@ + @@ -4780,6 +4781,7 @@ + @@ -129317,6 +129319,11 @@ + + + + + @@ -142184,6 +142191,10 @@ + + + + diff --git a/android/abi_gki_aarch64_rockchip b/android/abi_gki_aarch64_rockchip index ec0b9c854e41..aabe57eaca5f 100644 --- a/android/abi_gki_aarch64_rockchip +++ b/android/abi_gki_aarch64_rockchip @@ -1087,6 +1087,7 @@ scsi_ioctl_block_when_processing_errors sdev_prefix_printk sdhci_add_host + sdhci_execute_tuning sdhci_get_property sdhci_pltfm_clk_get_max_clock sdhci_pltfm_free @@ -2442,9 +2443,11 @@ component_unbind_all devm_of_phy_get_by_index driver_find_device + drm_add_modes_noedid drm_atomic_commit 
drm_atomic_get_connector_state drm_atomic_get_plane_state + drm_atomic_helper_bridge_propagate_bus_fmt drm_atomic_helper_check drm_atomic_helper_check_plane_state drm_atomic_helper_cleanup_planes @@ -2477,6 +2480,10 @@ __drm_atomic_state_free drm_bridge_chain_mode_set drm_bridge_get_edid + drm_connector_attach_content_protection_property + drm_connector_list_iter_begin + drm_connector_list_iter_end + drm_connector_list_iter_next drm_connector_list_update drm_crtc_cleanup drm_crtc_enable_color_mgmt @@ -2522,6 +2529,7 @@ drm_gem_unmap_dma_buf drm_get_format_info drm_get_format_name + drm_hdcp_update_content_protection drm_helper_mode_fill_fb_struct drm_kms_helper_poll_enable drm_kms_helper_poll_fini @@ -2547,6 +2555,7 @@ drm_mode_prune_invalid drm_mode_set_crtcinfo drm_modeset_lock_all + drm_modeset_unlock drm_modeset_unlock_all drm_mode_sort drm_mode_validate_size @@ -2560,6 +2569,7 @@ drm_plane_create_zpos_property drm_prime_get_contiguous_size __drm_printfn_seq_file + drm_property_blob_put drm_property_create drm_property_create_bitmask drm_property_create_bool @@ -2567,6 +2577,8 @@ drm_property_create_object drm_property_create_range drm_property_destroy + drm_property_lookup_blob + drm_property_replace_blob __drm_puts_seq_file drm_rect_calc_hscale drm_self_refresh_helper_cleanup @@ -2608,7 +2620,6 @@ sdhci_cqe_irq sdhci_dumpregs sdhci_enable_clk - sdhci_execute_tuning sdhci_pltfm_unregister sdhci_set_power_and_bus_voltage sdhci_set_uhs_signaling @@ -2620,6 +2631,7 @@ sdhci_adma_write_desc sdhci_remove_host sdhci_request + sdhci_reset_tuning # required by sg.ko blk_get_request From c83ab50b6e835f5f63100153229944a10ea90763 Mon Sep 17 00:00:00 2001 From: Ye Bin Date: Thu, 14 Apr 2022 10:52:23 +0800 Subject: [PATCH 50/51] BACKPORT: ext4: fix use-after-free in ext4_rename_dir_prepare commit 0be698ecbe4471fcad80e81ec6a05001421041b3 upstream. We got issue as follows: EXT4-fs (loop0): mounted filesystem without journal. 
Opts: ,errors=continue ext4_get_first_dir_block: bh->b_data=0xffff88810bee6000 len=34478 ext4_get_first_dir_block: *parent_de=0xffff88810beee6ae bh->b_data=0xffff88810bee6000 ext4_rename_dir_prepare: [1] parent_de=0xffff88810beee6ae ================================================================== BUG: KASAN: use-after-free in ext4_rename_dir_prepare+0x152/0x220 Read of size 4 at addr ffff88810beee6ae by task rep/1895 CPU: 13 PID: 1895 Comm: rep Not tainted 5.10.0+ #241 Call Trace: dump_stack+0xbe/0xf9 print_address_description.constprop.0+0x1e/0x220 kasan_report.cold+0x37/0x7f ext4_rename_dir_prepare+0x152/0x220 ext4_rename+0xf44/0x1ad0 ext4_rename2+0x11c/0x170 vfs_rename+0xa84/0x1440 do_renameat2+0x683/0x8f0 __x64_sys_renameat+0x53/0x60 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f45a6fc41c9 RSP: 002b:00007ffc5a470218 EFLAGS: 00000246 ORIG_RAX: 0000000000000108 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f45a6fc41c9 RDX: 0000000000000005 RSI: 0000000020000180 RDI: 0000000000000005 RBP: 00007ffc5a470240 R08: 00007ffc5a470160 R09: 0000000020000080 R10: 00000000200001c0 R11: 0000000000000246 R12: 0000000000400bb0 R13: 00007ffc5a470320 R14: 0000000000000000 R15: 0000000000000000 The buggy address belongs to the page: page:00000000440015ce refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x10beee flags: 0x200000000000000() raw: 0200000000000000 ffffea00043ff4c8 ffffea0004325608 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88810beee580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88810beee600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff88810beee680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff88810beee700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88810beee780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Disabling lock debugging due to kernel taint ext4_rename_dir_prepare: [2] parent_de->inode=3537895424 ext4_rename_dir_prepare: [3] dir=0xffff888124170140 ext4_rename_dir_prepare: [4] ino=2 ext4_rename_dir_prepare: ent->dir->i_ino=2 parent=-757071872 Reason is first directory entry which 'rec_len' is 34478, then will get illegal parent entry. Now, we do not check directory entry after read directory block in 'ext4_get_first_dir_block'. To solve this issue, check directory entry in 'ext4_get_first_dir_block'. [ Trigger an ext4_error() instead of just warning if the directory is missing a '.' or '..' entry. Also make sure we return an error code if the file system is corrupted. -TYT ] Signed-off-by: Ye Bin Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20220414025223.4113128-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman [ta: Adapt patch to cope with the android specific changes introduced in commit 705a3e5b1852 ("ANDROID: ext4: Handle casefolding with encryption"). Pass zero value for lblk when calling ext4_check_dir_entry().] 
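As a concrete illustration of the arithmetic in the report (toy_dirent below is a
stand-in for illustration only; the real layout is struct ext4_dir_entry_2 in
fs/ext4/ext4.h): a first entry whose rec_len is 34478 puts the "next entry"
roughly 30 KB past the 4096-byte directory block, which matches the gap between
bh->b_data (0xffff88810bee6000) and the bogus parent_de (0xffff88810beee6ae)
shown above.

  #include <stdint.h>
  #include <stdio.h>

  /* Toy directory-entry header, for illustration only. */
  struct toy_dirent {
          uint32_t inode;
          uint16_t rec_len;       /* bytes from this entry to the next one */
          uint8_t  name_len;
          uint8_t  file_type;
  };

  int main(void)
  {
          const size_t blocksize = 4096;  /* directory block size in the report */
          struct toy_dirent dot = { .inode = 2, .rec_len = 34478, .name_len = 1 };

          /* ext4_next_entry() effectively advances by rec_len, so ".."
           * would be read from this offset inside the block. */
          size_t next_off = dot.rec_len;

          if (next_off + sizeof(struct toy_dirent) > blocksize)
                  printf("rec_len %u reaches %zu bytes past a %zu-byte block\n",
                         (unsigned)dot.rec_len, next_off - blocksize, blocksize);
          return 0;
  }

With the checks added below, such a rec_len fails ext4_check_dir_entry(), so
ext4_get_first_dir_block() reports the corruption via EXT4_ERROR_INODE() and
returns -EFSCORRUPTED instead of handing back a pointer into freed memory.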
Cc: Daniel Rosenberg
Reported-and-tested-by: syzbot+a07b88e6427ec1c97aa5@syzkaller.appspotmail.com
Signed-off-by: Tudor Ambarus
Change-Id: I9d4218ffa0ddae2aa75aa4755221ef7f856b04e9
---
 fs/ext4/namei.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 6b27be96b6e5..3a71f883e73f 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -3647,6 +3647,9 @@ static struct buffer_head *ext4_get_first_dir_block(handle_t *handle,
 	struct buffer_head *bh;
 
 	if (!ext4_has_inline_data(inode)) {
+		struct ext4_dir_entry_2 *de;
+		unsigned int offset;
+
 		/* The first directory block must not be a hole, so
 		 * treat it as DIRENT_HTREE
 		 */
@@ -3655,9 +3658,30 @@ static struct buffer_head *ext4_get_first_dir_block(handle_t *handle,
 			*retval = PTR_ERR(bh);
 			return NULL;
 		}
-		*parent_de = ext4_next_entry(
-					(struct ext4_dir_entry_2 *)bh->b_data,
-					inode->i_sb->s_blocksize);
+
+		de = (struct ext4_dir_entry_2 *) bh->b_data;
+		if (ext4_check_dir_entry(inode, NULL, de, bh, bh->b_data,
+					 bh->b_size, 0, 0) ||
+		    le32_to_cpu(de->inode) != inode->i_ino ||
+		    strcmp(".", de->name)) {
+			EXT4_ERROR_INODE(inode, "directory missing '.'");
+			brelse(bh);
+			*retval = -EFSCORRUPTED;
+			return NULL;
+		}
+		offset = ext4_rec_len_from_disk(de->rec_len,
+						inode->i_sb->s_blocksize);
+		de = ext4_next_entry(de, inode->i_sb->s_blocksize);
+		if (ext4_check_dir_entry(inode, NULL, de, bh, bh->b_data,
+					 bh->b_size, 0, offset) ||
+		    le32_to_cpu(de->inode) == 0 || strcmp("..", de->name)) {
+			EXT4_ERROR_INODE(inode, "directory missing '..'");
+			brelse(bh);
+			*retval = -EFSCORRUPTED;
+			return NULL;
+		}
+		*parent_de = de;
+
 		return bh;
 	}

From da3d81545d1d789164aeb5918991f7a4c4844744 Mon Sep 17 00:00:00 2001
From: Srinivasarao Pathipati
Date: Fri, 27 Jan 2023 11:34:03 +0530
Subject: [PATCH 51/51] ANDROID: cpu: correct dl_cpu_busy() calls

The patch 0039189a3b15 ("sched/deadline: Merge dl_task_can_attach() and
dl_cpu_busy()"), which was picked from upstream, modifies the declaration
of dl_cpu_busy(). However, it did not update the caller in the
Android-specific code introduced by patch 683010f555d8 ("ANDROID:
cpu/hotplug: add pause/resume_cpus interface").

Bug: 266874695
Fixes: 0039189a3b15 ("sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy()")
Change-Id: I40c12f912b7fe854b1e2e13f75c727c3c9a2435c
Signed-off-by: Srinivasarao Pathipati
---
 kernel/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index fc15c01d61b8..9bd53e65246b 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1159,7 +1159,7 @@ int remove_cpu(unsigned int cpu)
 }
 EXPORT_SYMBOL_GPL(remove_cpu);
 
-extern bool dl_cpu_busy(unsigned int cpu);
+extern int dl_cpu_busy(int cpu, struct task_struct *p);
 
 int __pause_drain_rq(struct cpumask *cpus)
 {
@@ -1234,7 +1234,7 @@ int pause_cpus(struct cpumask *cpus)
 	cpumask_and(cpus, cpus, cpu_active_mask);
 
 	for_each_cpu(cpu, cpus) {
-		if (!cpu_online(cpu) || dl_cpu_busy(cpu) ||
+		if (!cpu_online(cpu) || dl_cpu_busy(cpu, NULL) ||
 		    get_cpu_device(cpu)->offline_disabled == true) {
 			err = -EBUSY;
 			goto err_cpu_maps_update;