49c25af89c6ad02b60a1a2bcbe16ecef330782cf
682 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
49c25af89c |
Revert "bpf: Remove extra lock_sock for TCP_ZEROCOPY_RECEIVE"
This reverts commit
|
||
![]() |
477f5e6b9e |
Merge 5.10.188 into android12-5.10-lts
Changes in 5.10.188 media: atomisp: fix "variable dereferenced before check 'asd'" x86/smp: Use dedicated cache-line for mwait_play_dead() can: isotp: isotp_sendmsg(): fix return error fix on TX path video: imsttfb: check for ioremap() failures fbdev: imsttfb: Fix use after free bug in imsttfb_probe HID: wacom: Use ktime_t rather than int when dealing with timestamps HID: logitech-hidpp: add HIDPP_QUIRK_DELAYED_INIT for the T651. Revert "thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe" scripts/tags.sh: Resolve gtags empty index generation drm/amdgpu: Validate VM ioctl flags. nubus: Partially revert proc_create_single_data() conversion fs: pipe: reveal missing function protoypes x86/resctrl: Only show tasks' pid in current pid namespace blk-iocost: use spin_lock_irqsave in adjust_inuse_and_calc_cost md/raid10: check slab-out-of-bounds in md_bitmap_get_counter md/raid10: fix overflow of md/safe_mode_delay md/raid10: fix wrong setting of max_corr_read_errors md/raid10: fix null-ptr-deref of mreplace in raid10_sync_request md/raid10: fix io loss while replacement replace rdev irqchip/jcore-aic: Kill use of irq_create_strict_mappings() irqchip/jcore-aic: Fix missing allocation of IRQ descriptors posix-timers: Prevent RT livelock in itimer_delete() tracing/timer: Add missing hrtimer modes to decode_hrtimer_mode(). clocksource/drivers/cadence-ttc: Fix memory leak in ttc_timer_probe PM: domains: fix integer overflow issues in genpd_parse_state() perf/arm-cmn: Fix DTC reset powercap: RAPL: Fix CONFIG_IOSF_MBI dependency ARM: 9303/1: kprobes: avoid missing-declaration warnings cpufreq: intel_pstate: Fix energy_performance_preference for passive thermal/drivers/sun8i: Fix some error handling paths in sun8i_ths_probe() rcuscale: Console output claims too few grace periods rcuscale: Always log error message rcuscale: Move shutdown from wait_event() to wait_event_idle() rcu/rcuscale: Move rcu_scale_*() after kfree_scale_cleanup() rcu/rcuscale: Stop kfree_scale_thread thread(s) after unloading rcuscale perf/ibs: Fix interface via core pmu events x86/mm: Fix __swp_entry_to_pte() for Xen PV guests evm: Complete description of evm_inode_setattr() ima: Fix build warnings pstore/ram: Add check for kstrdup igc: Enable and fix RX hash usage by netstack wifi: ath9k: fix AR9003 mac hardware hang check register offset calculation wifi: ath9k: avoid referencing uninit memory in ath9k_wmi_ctrl_rx samples/bpf: Fix buffer overflow in tcp_basertt spi: spi-geni-qcom: Correct CS_TOGGLE bit in SPI_TRANS_CFG wifi: wilc1000: fix for absent RSN capabilities WFA testcase wifi: mwifiex: Fix the size of a memory allocation in mwifiex_ret_802_11_scan() bpf: Remove extra lock_sock for TCP_ZEROCOPY_RECEIVE sctp: add bpf_bypass_getsockopt proto callback libbpf: fix offsetof() and container_of() to work with CO-RE nfc: constify several pointers to u8, char and sk_buff nfc: llcp: fix possible use of uninitialized variable in nfc_llcp_send_connect() bpftool: JIT limited misreported as negative value on aarch64 regulator: core: Fix more error checking for debugfs_create_dir() regulator: core: Streamline debugfs operations wifi: orinoco: Fix an error handling path in spectrum_cs_probe() wifi: orinoco: Fix an error handling path in orinoco_cs_probe() wifi: atmel: Fix an error handling path in atmel_probe() wl3501_cs: Fix misspelling and provide missing documentation net: create netdev->dev_addr assignment helpers wl3501_cs: use eth_hw_addr_set() wifi: wl3501_cs: Fix an error handling path in wl3501_probe() wifi: ray_cs: Utilize strnlen() in parse_addr() wifi: ray_cs: Drop useless status variable in parse_addr() wifi: ray_cs: Fix an error handling path in ray_probe() wifi: ath9k: don't allow to overwrite ENDPOINT0 attributes wifi: rsi: Do not configure WoWlan in shutdown hook if not enabled wifi: rsi: Do not set MMC_PM_KEEP_POWER in shutdown watchdog/perf: define dummy watchdog_update_hrtimer_threshold() on correct config watchdog/perf: more properly prevent false positives with turbo modes kexec: fix a memory leak in crash_shrink_memory() memstick r592: make memstick_debug_get_tpc_name() static wifi: ath9k: Fix possible stall on ath9k_txq_list_has_key() rtnetlink: extend RTEXT_FILTER_SKIP_STATS to IFLA_VF_INFO wifi: iwlwifi: pull from TXQs with softirqs disabled wifi: cfg80211: rewrite merging of inherited elements wifi: ath9k: convert msecs to jiffies where needed igc: Fix race condition in PTP tx code net: stmmac: fix double serdes powerdown netlink: fix potential deadlock in netlink_set_err() netlink: do not hard code device address lenth in fdb dumps selftests: rtnetlink: remove netdevsim device after ipsec offload test gtp: Fix use-after-free in __gtp_encap_destroy(). net: axienet: Move reset before 64-bit DMA detection sfc: fix crash when reading stats while NIC is resetting nfc: llcp: simplify llcp_sock_connect() error paths net: nfc: Fix use-after-free caused by nfc_llcp_find_local lib/ts_bm: reset initial match offset for every block of text netfilter: conntrack: dccp: copy entire header to stack buffer, not just basic one netfilter: nf_conntrack_sip: fix the ct_sip_parse_numerical_param() return value. ipvlan: Fix return value of ipvlan_queue_xmit() netlink: Add __sock_i_ino() for __netlink_diag_dump(). radeon: avoid double free in ci_dpm_init() drm/amd/display: Explicitly specify update type per plane info change Input: drv260x - sleep between polling GO bit drm/bridge: tc358768: always enable HS video mode drm/bridge: tc358768: fix PLL parameters computation drm/bridge: tc358768: fix PLL target frequency drm/bridge: tc358768: fix TCLK_ZEROCNT computation drm/bridge: tc358768: Add atomic_get_input_bus_fmts() implementation drm/bridge: tc358768: fix TCLK_TRAILCNT computation drm/bridge: tc358768: fix THS_ZEROCNT computation drm/bridge: tc358768: fix TXTAGOCNT computation drm/bridge: tc358768: fix THS_TRAILCNT computation drm/vram-helper: fix function names in vram helper doc ARM: dts: BCM5301X: Drop "clock-names" from the SPI node ARM: dts: meson8b: correct uart_B and uart_C clock references Input: adxl34x - do not hardcode interrupt trigger type drm: sun4i_tcon: use devm_clk_get_enabled in `sun4i_tcon_init_clocks` drm/panel: sharp-ls043t1le01: adjust mode settings ARM: dts: stm32: Move ethernet MAC EEPROM from SoM to carrier boards bus: ti-sysc: Fix dispc quirk masking bool variables arm64: dts: microchip: sparx5: do not use PSCI on reference boards RDMA/bnxt_re: Disable/kill tasklet only if it is enabled RDMA/bnxt_re: Fix to remove unnecessary return labels RDMA/bnxt_re: Use unique names while registering interrupts RDMA/bnxt_re: Remove a redundant check inside bnxt_re_update_gid RDMA/bnxt_re: Fix to remove an unnecessary log ARM: dts: gta04: Move model property out of pinctrl node arm64: dts: qcom: msm8916: correct camss unit address arm64: dts: qcom: msm8994: correct SPMI unit address arm64: dts: qcom: msm8996: correct camss unit address drm/panel: simple: fix active size for Ampire AM-480272H3TMQW-T01H ARM: ep93xx: fix missing-prototype warnings ARM: omap2: fix missing tick_broadcast() prototype arm64: dts: qcom: apq8096: fix fixed regulator name property ARM: dts: stm32: Shorten the AV96 HDMI sound card name memory: brcmstb_dpfe: fix testing array offset after use ASoC: es8316: Increment max value for ALC Capture Target Volume control ASoC: es8316: Do not set rate constraints for unsupported MCLKs ARM: dts: meson8: correct uart_B and uart_C clock references soc/fsl/qe: fix usb.c build errors IB/hfi1: Use bitmap_zalloc() when applicable IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors IB/hfi1: Fix wrong mmu_node used for user SDMA packet after invalidate RDMA: Remove uverbs_ex_cmd_mask values that are linked to functions RDMA/hns: Fix coding style issues RDMA/hns: Use refcount_t APIs for HEM RDMA/hns: Clean the hardware related code for HEM RDMA/hns: Fix hns_roce_table_get return value ARM: dts: iwg20d-q7-common: Fix backlight pwm specifier arm64: dts: renesas: ulcb-kf: Remove flow control for SCIF1 fbdev: omapfb: lcd_mipid: Fix an error handling path in mipid_spi_probe() arm64: dts: ti: k3-j7200: Fix physical address of pin ARM: dts: stm32: Fix audio routing on STM32MP15xx DHCOM PDK2 ARM: dts: stm32: fix i2s endpoint format property for stm32mp15xx-dkx hwmon: (gsc-hwmon) fix fan pwm temperature scaling hwmon: (adm1275) enable adm1272 temperature reporting hwmon: (adm1275) Allow setting sample averaging hwmon: (pmbus/adm1275) Fix problems with temperature monitoring on ADM1272 ARM: dts: BCM5301X: fix duplex-full => full-duplex drm/amdkfd: Fix potential deallocation of previously deallocated memory. drm/radeon: fix possible division-by-zero errors amdgpu: validate offset_in_bo of drm_amdgpu_gem_va RDMA/bnxt_re: wraparound mbox producer index RDMA/bnxt_re: Avoid calling wake_up threads from spin_lock context clk: imx: clk-imx8mn: fix memory leak in imx8mn_clocks_probe clk: imx: clk-imx8mp: improve error handling in imx8mp_clocks_probe() clk: tegra: tegra124-emc: Fix potential memory leak ALSA: ac97: Fix possible NULL dereference in snd_ac97_mixer drm/msm/dpu: do not enable color-management if DSPPs are not available drm/msm/dp: Free resources after unregistering them clk: vc5: check memory returned by kasprintf() clk: cdce925: check return value of kasprintf() clk: si5341: Allow different output VDD_SEL values clk: si5341: Add sysfs properties to allow checking/resetting device faults clk: si5341: return error if one synth clock registration fails clk: si5341: check return value of {devm_}kasprintf() clk: si5341: free unused memory on probe failure clk: keystone: sci-clk: check return value of kasprintf() clk: ti: clkctrl: check return value of kasprintf() drivers: meson: secure-pwrc: always enable DMA domain ovl: update of dentry revalidate flags after copy up ASoC: imx-audmix: check return value of devm_kasprintf() PCI: cadence: Fix Gen2 Link Retraining process scsi: qedf: Fix NULL dereference in error handling pinctrl: bcm2835: Handle gpiochip_add_pin_range() errors PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free scsi: 3w-xxxx: Add error handling for initialization failure in tw_probe() PCI: pciehp: Cancel bringup sequence if card is not present PCI: ftpci100: Release the clock resources PCI: Add pci_clear_master() stub for non-CONFIG_PCI perf bench: Use unbuffered output when pipe/tee'ing to a file perf bench: Add missing setlocale() call to allow usage of %'d style formatting pinctrl: cherryview: Return correct value if pin in push-pull mode kcsan: Don't expect 64 bits atomic builtins from 32 bits architectures perf script: Fixup 'struct evsel_script' method prefix perf script: Fix allocation of evsel->priv related to per-event dump files perf dwarf-aux: Fix off-by-one in die_get_varname() pinctrl: at91-pio4: check return value of devm_kasprintf() powerpc/powernv/sriov: perform null check on iov before dereferencing iov mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t * mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t * powerpc/book3s64/mm: Fix DirectMap stats in /proc/meminfo powerpc/mm/dax: Fix the condition when checking if altmap vmemap can cross-boundary hwrng: virtio - add an internal buffer hwrng: virtio - don't wait on cleanup hwrng: virtio - don't waste entropy hwrng: virtio - always add a pending request hwrng: virtio - Fix race on data_avail and actual data crypto: nx - fix build warnings when DEBUG_FS is not enabled modpost: fix section mismatch message for R_ARM_ABS32 modpost: fix section mismatch message for R_ARM_{PC24,CALL,JUMP24} crypto: marvell/cesa - Fix type mismatch warning modpost: fix off by one in is_executable_section() ARC: define ASM_NL and __ALIGN(_STR) outside #ifdef __ASSEMBLY__ guard NFSv4.1: freeze the session table upon receiving NFS4ERR_BADSESSION dax: Fix dax_mapping_release() use after free dax: Introduce alloc_dev_dax_id() hwrng: st - keep clock enabled while hwrng is registered io_uring: ensure IOPOLL locks around deferred work USB: serial: option: add LARA-R6 01B PIDs usb: dwc3: gadget: Propagate core init errors to UDC during pullup phy: tegra: xusb: Clear the driver reference in usb-phy dev block: fix signed int overflow in Amiga partition support block: change all __u32 annotations to __be32 in affs_hardblocks.h SUNRPC: Fix UAF in svc_tcp_listen_data_ready() w1: w1_therm: fix locking behavior in convert_t w1: fix loop in w1_fini() sh: j2: Use ioremap() to translate device tree address into kernel memory serial: 8250: omap: Fix freeing of resources on failed register clk: qcom: gcc-ipq6018: Use floor ops for sdcc clocks media: usb: Check az6007_read() return value media: videodev2.h: Fix struct v4l2_input tuner index comment media: usb: siano: Fix warning due to null work_func_t function pointer clk: qcom: reset: Allow specifying custom reset delay clk: qcom: reset: support resetting multiple bits clk: qcom: ipq6018: fix networking resets usb: dwc3: qcom: Fix potential memory leak usb: gadget: u_serial: Add null pointer check in gserial_suspend extcon: Fix kernel doc of property fields to avoid warnings extcon: Fix kernel doc of property capability fields to avoid warnings usb: phy: phy-tahvo: fix memory leak in tahvo_usb_probe() usb: hide unused usbfs_notify_suspend/resume functions serial: 8250: lock port for stop_rx() in omap8250_irq() serial: 8250: lock port for UART_IER access in omap8250_irq() kernfs: fix missing kernfs_idr_lock to remove an ID from the IDR coresight: Fix loss of connection info when a module is unloaded mfd: rt5033: Drop rt5033-battery sub-device media: venus: helpers: Fix ALIGN() of non power of two media: atomisp: gmin_platform: fix out_len in gmin_get_config_dsm_var() KVM: s390: fix KVM_S390_GET_CMMA_BITS for GFNs in memslot holes usb: dwc3: qcom: Release the correct resources in dwc3_qcom_remove() usb: dwc3: qcom: Fix an error handling path in dwc3_qcom_probe() usb: common: usb-conn-gpio: Set last role to unknown before initial detection usb: dwc3-meson-g12a: Fix an error handling path in dwc3_meson_g12a_probe() mfd: intel-lpss: Add missing check for platform_get_resource Revert "usb: common: usb-conn-gpio: Set last role to unknown before initial detection" serial: 8250_omap: Use force_suspend and resume for system suspend test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation mfd: stmfx: Fix error path in stmfx_chip_init mfd: stmfx: Nullify stmfx->vdd in case of error KVM: s390: vsie: fix the length of APCB bitmap mfd: stmpe: Only disable the regulators if they are enabled phy: tegra: xusb: check return value of devm_kzalloc() pwm: imx-tpm: force 'real_period' to be zero in suspend pwm: sysfs: Do not apply state to already disabled PWMs rtc: st-lpc: Release some resources in st_rtc_probe() in case of error media: cec: i2c: ch7322: also select REGMAP sctp: fix potential deadlock on &net->sctp.addr_wq_lock Add MODULE_FIRMWARE() for FIRMWARE_TG357766. net: dsa: vsc73xx: fix MTU configuration spi: bcm-qspi: return error if neither hif_mspi nor mspi is available mailbox: ti-msgmgr: Fill non-message tx data fields with 0x0 f2fs: fix error path handling in truncate_dnode() octeontx2-af: Fix mapping for NIX block from CGX connection powerpc: allow PPC_EARLY_DEBUG_CPM only when SERIAL_CPM=y net: bridge: keep ports without IFF_UNICAST_FLT in BR_PROMISC mode tcp: annotate data races in __tcp_oow_rate_limited() xsk: Honor SO_BINDTODEVICE on bind net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX pptp: Fix fib lookup calls. net: dsa: tag_sja1105: fix MAC DA patching from meta frames s390/qeth: Fix vipa deletion sh: dma: Fix DMA channel offset calculation apparmor: fix missing error check for rhashtable_insert_fast i2c: xiic: Defer xiic_wakeup() and __xiic_start_xfer() in xiic_process() i2c: xiic: Don't try to handle more interrupt events after error ALSA: jack: Fix mutex call in snd_jack_report() i2c: qup: Add missing unwind goto in qup_i2c_probe() NFSD: add encoding of op_recall flag for write delegation io_uring: wait interruptibly for request completions on exit mmc: core: disable TRIM on Kingston EMMC04G-M627 mmc: core: disable TRIM on Micron MTFC4GACAJCN-1M mmc: mmci: Set PROBE_PREFER_ASYNCHRONOUS mmc: sdhci: fix DMA configure compatibility issue when 64bit DMA mode is used. bcache: fixup btree_cache_wait list damage bcache: Remove unnecessary NULL point check in node allocations bcache: Fix __bch_btree_node_alloc to make the failure behavior consistent um: Use HOST_DIR for mrproper integrity: Fix possible multiple allocation in integrity_inode_get() autofs: use flexible array in ioctl structure shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfs jffs2: reduce stack usage in jffs2_build_xattr_subsystem() fs: avoid empty option when generating legacy mount string ext4: Remove ext4 locking of moved directory Revert "f2fs: fix potential corruption when moving a directory" fs: Establish locking order for unrelated directories fs: Lock moved directories btrfs: add handling for RAID1C23/DUP to btrfs_reduce_alloc_profile btrfs: fix race when deleting quota root from the dirty cow roots list ASoC: mediatek: mt8173: Fix irq error path ASoC: mediatek: mt8173: Fix snd_soc_component_initialize error path ARM: orion5x: fix d2net gpio initialization leds: trigger: netdev: Recheck NETDEV_LED_MODE_LINKUP on dev rename fs: no need to check source fanotify: disallow mount/sb marks on kernel internal pseudo fs tpm, tpm_tis: Claim locality in interrupt handler selftests/bpf: Add verifier test for PTR_TO_MEM spill block: add overflow checks for Amiga partition support sh: pgtable-3level: Fix cast to pointer from integer of different size netfilter: nf_tables: use net_generic infra for transaction data netfilter: nf_tables: add rescheduling points during loop detection walks netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE netfilter: nf_tables: fix chain binding transaction logic netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain netfilter: nf_tables: reject unbound anonymous set before commit phase netfilter: nf_tables: reject unbound chain set before commit phase netfilter: nftables: rename set element data activation/deactivation functions netfilter: nf_tables: drop map element references from preparation phase netfilter: nf_tables: unbind non-anonymous set if rule construction fails netfilter: nf_tables: fix scheduling-while-atomic splat netfilter: conntrack: Avoid nf_ct_helper_hash uses after free netfilter: nf_tables: do not ignore genmask when looking up chain by id netfilter: nf_tables: prevent OOB access in nft_byteorder_eval wireguard: queueing: use saner cpu selection wrapping wireguard: netlink: send staged packets when setting initial private key tty: serial: fsl_lpuart: add earlycon for imx8ulp platform rcu-tasks: Mark ->trc_reader_nesting data races rcu-tasks: Mark ->trc_reader_special.b.need_qs data races rcu-tasks: Simplify trc_read_check_handler() atomic operations block/partition: fix signedness issue for Amiga partitions io_uring: Use io_schedule* in cqring wait io_uring: add reschedule point to handle_tw_list() net: lan743x: Don't sleep in atomic context workqueue: clean up WORK_* constant types, clarify masking drm/panel: simple: Add connector_type for innolux_at043tn24 drm/panel: simple: Add Powertip PH800480T013 drm_display_mode flags igc: Remove delay during TX ring configuration net/mlx5e: fix double free in mlx5e_destroy_flow_table net/mlx5e: Check for NOT_READY flag state after locking igc: set TP bit in 'supported' and 'advertising' fields of ethtool_link_ksettings scsi: qla2xxx: Fix error code in qla2x00_start_sp() net: mvneta: fix txq_map in case of txq_number==1 net/sched: cls_fw: Fix improper refcount update leads to use-after-free gve: Set default duplex configuration to full ionic: remove WARN_ON to prevent panic_on_warn net: bgmac: postpone turning IRQs off to avoid SoC hangs net: prevent skb corruption on frag list segmentation icmp6: Fix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev(). udp6: fix udp6_ehashfn() typo ntb: idt: Fix error handling in idt_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: ntb_transport: fix possible memory leak while device_register() fails NTB: ntb_tool: Add check for devm_kcalloc ipv6/addrconf: fix a potential refcount underflow for idev platform/x86: wmi: remove unnecessary argument platform/x86: wmi: use guid_t and guid_equal() platform/x86: wmi: move variables platform/x86: wmi: Break possible infinite loop when parsing GUID igc: Fix launchtime before start of cycle igc: Fix inserting of empty frame for launchtime riscv: bpf: Move bpf_jit_alloc_exec() and bpf_jit_free_exec() to core riscv: bpf: Avoid breaking W^X bpf, riscv: Support riscv jit to provide bpf_line_info riscv, bpf: Fix inconsistent JIT image generation erofs: avoid infinite loop in z_erofs_do_read_page() when reading beyond EOF wifi: airo: avoid uninitialized warning in airo_get_rate() net/sched: flower: Ensure both minimum and maximum ports are specified netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write() net/sched: make psched_mtu() RTNL-less safe net/sched: sch_qfq: refactor parsing of netlink parameters net/sched: sch_qfq: account for stab overhead in qfq_enqueue nvme-pci: fix DMA direction of unmapping integrity data f2fs: fix to avoid NULL pointer dereference f2fs_write_end_io() pinctrl: amd: Fix mistake in handling clearing pins at startup pinctrl: amd: Detect internal GPIO0 debounce handling pinctrl: amd: Only use special debounce behavior for GPIO 0 tpm: tpm_vtpm_proxy: fix a race condition in /dev/vtpmx creation mtd: rawnand: meson: fix unaligned DMA buffers handling net: bcmgenet: Ensure MDIO unregistration has clocks enabled powerpc: Fail build if using recordmcount with binutils v2.37 misc: fastrpc: Create fastrpc scalar with correct buffer count erofs: fix compact 4B support for 16k block size MIPS: Loongson: Fix cpu_probe_loongson() again ext4: Fix reusing stale buffer heads from last failed mounting ext4: fix wrong unit use in ext4_mb_clear_bb ext4: get block from bh in ext4_free_blocks for fast commit replay ext4: fix wrong unit use in ext4_mb_new_blocks ext4: only update i_reserved_data_blocks on successful block allocation jfs: jfs_dmap: Validate db_l2nbperpage while mounting hwrng: imx-rngc - fix the timeout for init and self check PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold PCI: Add function 1 DMA alias quirk for Marvell 88SE9235 PCI: qcom: Disable write access to read only registers for IP v2.3.3 PCI: rockchip: Assert PCI Configuration Enable bit after probe PCI: rockchip: Write PCI Device ID to correct register PCI: rockchip: Add poll and timeout to wait for PHY PLLs to be locked PCI: rockchip: Fix legacy IRQ generation for RK3399 PCIe endpoint core PCI: rockchip: Use u32 variable to access 32-bit registers PCI: rockchip: Set address alignment for endpoint mode misc: pci_endpoint_test: Free IRQs before removing the device misc: pci_endpoint_test: Re-init completion for every test md/raid0: add discard support for the 'original' layout fs: dlm: return positive pid value for F_GETLK drm/atomic: Allow vblank-enabled + self-refresh "disable" drm/rockchip: vop: Leave vblank enabled in self-refresh drm/amd/display: Correct `DMUB_FW_VERSION` macro serial: atmel: don't enable IRQs prematurely tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() in case of error tty: serial: samsung_tty: Fix a memory leak in s3c24xx_serial_getclk() when iterating clk firmware: stratix10-svc: Fix a potential resource leak in svc_create_memory_pool() ceph: don't let check_caps skip sending responses for revoke msgs xhci: Fix resume issue of some ZHAOXIN hosts xhci: Fix TRB prefetch issue of ZHAOXIN hosts xhci: Show ZHAOXIN xHCI root hub speed correctly meson saradc: fix clock divider mask length Revert "8250: add support for ASIX devices with a FIFO bug" s390/decompressor: fix misaligned symbol build error tracing/histograms: Add histograms to hist_vars if they have referenced variables samples: ftrace: Save required argument registers in sample trampolines net: ena: fix shift-out-of-bounds in exponential backoff ring-buffer: Fix deadloop issue on reading trace_pipe xtensa: ISS: fix call to split_if_spec tracing: Fix null pointer dereference in tracing_err_log_open() tracing/probes: Fix not to count error code to total length scsi: qla2xxx: Wait for io return on terminate rport scsi: qla2xxx: Array index may go out of bound scsi: qla2xxx: Fix buffer overrun scsi: qla2xxx: Fix potential NULL pointer dereference scsi: qla2xxx: Check valid rport returned by fc_bsg_to_rport() scsi: qla2xxx: Correct the index of array scsi: qla2xxx: Pointer may be dereferenced scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue net/sched: sch_qfq: reintroduce lmax bound check for MTU RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests drm/atomic: Fix potential use-after-free in nonblocking commits ALSA: hda/realtek - remove 3k pull low procedure ALSA: hda/realtek: Enable Mute LED on HP Laptop 15s-eq2xxx keys: Fix linking a duplicate key to a keyring's assoc_array perf probe: Add test for regression introduced by switch to die_get_decl_file() btrfs: fix warning when putting transaction with qgroups enabled after abort fuse: revalidate: don't invalidate if interrupted selftests: tc: set timeout to 15 minutes selftests: tc: add 'ct' action kconfig dep regmap: Drop initial version of maximum transfer length fixes regmap: Account for register length in SMBus I/O limits can: bcm: Fix UAF in bcm_proc_show() drm/client: Fix memory leak in drm_client_target_cloned drm/client: Fix memory leak in drm_client_modeset_probe ASoC: fsl_sai: Disable bit clock with transmitter ext4: correct inline offset when handling xattrs in inode body debugobjects: Recheck debug_objects_enabled before reporting nbd: Add the maximum limit of allocated index in nbd_dev_add md: fix data corruption for raid456 when reshape restart while grow up md/raid10: prevent soft lockup while flush writes posix-timers: Ensure timer ID search-loop limit is valid btrfs: add xxhash to fast checksum implementations ACPI: button: Add lid disable DMI quirk for Nextbook Ares 8A ACPI: video: Add backlight=native DMI quirk for Apple iMac11,3 ACPI: video: Add backlight=native DMI quirk for Lenovo ThinkPad X131e (3371 AMD version) arm64: set __exception_irq_entry with __irq_entry as a default arm64: mm: fix VA-range sanity check sched/fair: Don't balance task to its current running CPU wifi: ath11k: fix registration of 6Ghz-only phy without the full channel range bpf: Address KCSAN report on bpf_lru_list devlink: report devlink_port_type_warn source device wifi: wext-core: Fix -Wstringop-overflow warning in ioctl_standard_iw_point() wifi: iwlwifi: mvm: avoid baid size integer overflow igb: Fix igb_down hung on surprise removal spi: bcm63xx: fix max prepend length fbdev: imxfb: warn about invalid left/right margin pinctrl: amd: Use amd_pinconf_set() for all config options net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field() bridge: Add extack warning when enabling STP in netns. iavf: Fix use-after-free in free_netdev iavf: Fix out-of-bounds when setting channels on remove security: keys: Modify mismatched function name octeontx2-pf: Dont allocate BPIDs for LBK interfaces tcp: annotate data-races around tcp_rsk(req)->ts_recent net: ipv4: Use kfree_sensitive instead of kfree net:ipv6: check return value of pskb_trim() Revert "tcp: avoid the lookup process failing to get sk in ehash table" fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe llc: Don't drop packet from non-root netns. netfilter: nf_tables: fix spurious set element insertion failure netfilter: nf_tables: can't schedule in nft_chain_validate netfilter: nft_set_pipapo: fix improper element removal netfilter: nf_tables: skip bound chain in netns release path netfilter: nf_tables: skip bound chain on rule flush tcp: annotate data-races around tp->tcp_tx_delay tcp: annotate data-races around tp->keepalive_time tcp: annotate data-races around tp->keepalive_intvl tcp: annotate data-races around tp->keepalive_probes net: Introduce net.ipv4.tcp_migrate_req. tcp: Fix data-races around sysctl_tcp_syn(ack)?_retries. tcp: annotate data-races around icsk->icsk_syn_retries tcp: annotate data-races around tp->linger2 tcp: annotate data-races around rskq_defer_accept tcp: annotate data-races around tp->notsent_lowat tcp: annotate data-races around icsk->icsk_user_timeout tcp: annotate data-races around fastopenq.max_qlen net: phy: prevent stale pointer dereference in phy_init() tracing/histograms: Return an error if we fail to add histogram to hist_vars list tracing: Fix memory leak of iter->temp when reading trace_pipe ftrace: Store the order of pages allocated in ftrace_page ftrace: Fix possible warning on checking all pages used in ftrace_process_locs() Linux 5.10.188 Change-Id: Ibcc1adc43df5b8f649b12078eedd5d4f57de4578 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> |
||
![]() |
937105d2b0 |
tcp: annotate data-races around tcp_rsk(req)->ts_recent
[ Upstream commit eba20811f32652bc1a52d5e7cc403859b86390d9 ]
TCP request sockets are lockless, tcp_rsk(req)->ts_recent
can change while being read by another cpu as syzbot noticed.
This is harmless, but we should annotate the known races.
Note that tcp_check_req() changes req->ts_recent a bit early,
we might change this in the future.
BUG: KCSAN: data-race in tcp_check_req / tcp_check_req
write to 0xffff88813c8afb84 of 4 bytes by interrupt on cpu 1:
tcp_check_req+0x694/0xc70 net/ipv4/tcp_minisocks.c:762
tcp_v4_rcv+0x12db/0x1b70 net/ipv4/tcp_ipv4.c:2071
ip_protocol_deliver_rcu+0x356/0x6d0 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x13c/0x1a0 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:303 [inline]
ip_local_deliver+0xec/0x1c0 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:468 [inline]
ip_rcv_finish net/ipv4/ip_input.c:449 [inline]
NF_HOOK include/linux/netfilter.h:303 [inline]
ip_rcv+0x197/0x270 net/ipv4/ip_input.c:569
__netif_receive_skb_one_core net/core/dev.c:5493 [inline]
__netif_receive_skb+0x90/0x1b0 net/core/dev.c:5607
process_backlog+0x21f/0x380 net/core/dev.c:5935
__napi_poll+0x60/0x3b0 net/core/dev.c:6498
napi_poll net/core/dev.c:6565 [inline]
net_rx_action+0x32b/0x750 net/core/dev.c:6698
__do_softirq+0xc1/0x265 kernel/softirq.c:571
do_softirq+0x7e/0xb0 kernel/softirq.c:472
__local_bh_enable_ip+0x64/0x70 kernel/softirq.c:396
local_bh_enable+0x1f/0x20 include/linux/bottom_half.h:33
rcu_read_unlock_bh include/linux/rcupdate.h:843 [inline]
__dev_queue_xmit+0xabb/0x1d10 net/core/dev.c:4271
dev_queue_xmit include/linux/netdevice.h:3088 [inline]
neigh_hh_output include/net/neighbour.h:528 [inline]
neigh_output include/net/neighbour.h:542 [inline]
ip_finish_output2+0x700/0x840 net/ipv4/ip_output.c:229
ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:317
NF_HOOK_COND include/linux/netfilter.h:292 [inline]
ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:431
dst_output include/net/dst.h:458 [inline]
ip_local_out net/ipv4/ip_output.c:126 [inline]
__ip_queue_xmit+0xa4d/0xa70 net/ipv4/ip_output.c:533
ip_queue_xmit+0x38/0x40 net/ipv4/ip_output.c:547
__tcp_transmit_skb+0x1194/0x16e0 net/ipv4/tcp_output.c:1399
tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline]
tcp_write_xmit+0x13ff/0x2fd0 net/ipv4/tcp_output.c:2693
__tcp_push_pending_frames+0x6a/0x1a0 net/ipv4/tcp_output.c:2877
tcp_push_pending_frames include/net/tcp.h:1952 [inline]
__tcp_sock_set_cork net/ipv4/tcp.c:3336 [inline]
tcp_sock_set_cork+0xe8/0x100 net/ipv4/tcp.c:3343
rds_tcp_xmit_path_complete+0x3b/0x40 net/rds/tcp_send.c:52
rds_send_xmit+0xf8d/0x1420 net/rds/send.c:422
rds_send_worker+0x42/0x1d0 net/rds/threads.c:200
process_one_work+0x3e6/0x750 kernel/workqueue.c:2408
worker_thread+0x5f2/0xa10 kernel/workqueue.c:2555
kthread+0x1d7/0x210 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
read to 0xffff88813c8afb84 of 4 bytes by interrupt on cpu 0:
tcp_check_req+0x32a/0xc70 net/ipv4/tcp_minisocks.c:622
tcp_v4_rcv+0x12db/0x1b70 net/ipv4/tcp_ipv4.c:2071
ip_protocol_deliver_rcu+0x356/0x6d0 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x13c/0x1a0 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:303 [inline]
ip_local_deliver+0xec/0x1c0 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:468 [inline]
ip_rcv_finish net/ipv4/ip_input.c:449 [inline]
NF_HOOK include/linux/netfilter.h:303 [inline]
ip_rcv+0x197/0x270 net/ipv4/ip_input.c:569
__netif_receive_skb_one_core net/core/dev.c:5493 [inline]
__netif_receive_skb+0x90/0x1b0 net/core/dev.c:5607
process_backlog+0x21f/0x380 net/core/dev.c:5935
__napi_poll+0x60/0x3b0 net/core/dev.c:6498
napi_poll net/core/dev.c:6565 [inline]
net_rx_action+0x32b/0x750 net/core/dev.c:6698
__do_softirq+0xc1/0x265 kernel/softirq.c:571
run_ksoftirqd+0x17/0x20 kernel/softirq.c:939
smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
kthread+0x1d7/0x210 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
value changed: 0x1cd237f1 -> 0x1cd237f2
Fixes:
|
||
![]() |
08f61a3491 |
bpf: Remove extra lock_sock for TCP_ZEROCOPY_RECEIVE
[ Upstream commit 9cacf81f8161111db25f98e78a7a0e32ae142b3f ] Add custom implementation of getsockopt hook for TCP_ZEROCOPY_RECEIVE. We skip generic hooks for TCP_ZEROCOPY_RECEIVE and have a custom call in do_tcp_getsockopt using the on-stack data. This removes 3% overhead for locking/unlocking the socket. Without this patch: 3.38% 0.07% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt | --3.30%--__cgroup_bpf_run_filter_getsockopt | --0.81%--__kmalloc With the patch applied: 0.52% 0.12% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt_kern Note, exporting uapi/tcp.h requires removing netinet/tcp.h from test_progs.h because those headers have confliciting definitions. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210115163501.805133-2-sdf@google.com Stable-dep-of: 2598619e012c ("sctp: add bpf_bypass_getsockopt proto callback") Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
![]() |
a4be51e26a |
Revert "net: Find dst with sk's xfrm policy not ctl_sk"
This reverts commit
|
||
![]() |
788791990d |
net: Find dst with sk's xfrm policy not ctl_sk
[ Upstream commit e22aa14866684f77b4f6b6cae98539e520ddb731 ] If we set XFRM security policy by calling setsockopt with option IPV6_XFRM_POLICY, the policy will be stored in 'sock_policy' in 'sock' struct. However tcp_v6_send_response doesn't look up dst_entry with the actual socket but looks up with tcp control socket. This may cause a problem that a RST packet is sent without ESP encryption & peer's TCP socket can't receive it. This patch will make the function look up dest_entry with actual socket, if the socket has XFRM policy(sock_policy), so that the TCP response packet via this function can be encrypted, & aligned on the encrypted TCP socket. Tested: We encountered this problem when a TCP socket which is encrypted in ESP transport mode encryption, receives challenge ACK at SYN_SENT state. After receiving challenge ACK, TCP needs to send RST to establish the socket at next SYN try. But the RST was not encrypted & peer TCP socket still remains on ESTABLISHED state. So we verified this with test step as below. [Test step] 1. Making a TCP state mismatch between client(IDLE) & server(ESTABLISHED). 2. Client tries a new connection on the same TCP ports(src & dst). 3. Server will return challenge ACK instead of SYN,ACK. 4. Client will send RST to server to clear the SOCKET. 5. Client will retransmit SYN to server on the same TCP ports. [Expected result] The TCP connection should be established. Cc: Maciej Żenczykowski <maze@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Sehee Lee <seheele@google.com> Signed-off-by: Sewook Seo <sewookseo@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: 1e306ec49a1f ("tcp: fix possible sk_priority leak in tcp_v4_send_reset()") Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
![]() |
04d393c4bb |
inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy().
commit b5fc29233d28be7a3322848ebe73ac327559cdb9 upstream. After commit d38afeec26ed ("tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct()."), we call inet6_destroy_sock() in sk->sk_destruct() by setting inet6_sock_destruct() to it to make sure we do not leak inet6-specific resources. Now we can remove unnecessary inet6_destroy_sock() calls in sk->sk_prot->destroy(). DCCP and SCTP have their own sk->sk_destruct() function, so we change them separately in the following patches. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
![]() |
7546fb3554 |
ipv6: Fix tcp socket connection with DSCP.
commit 8230680f36fd1525303d1117768c8852314c488c upstream.
Take into account the IPV6_TCLASS socket option (DSCP) in
tcp_v6_connect(). Otherwise fib6_rule_match() can't properly
match the DSCP value, resulting in invalid route lookup.
For example:
ip route add unreachable table main 2001:db8::10/124
ip route add table 100 2001:db8::10/124 dev eth0
ip -6 rule add dsfield 0x04 table 100
echo test | socat - TCP6:[2001:db8::11]:54321,ipv6-tclass=0x04
Without this patch, socat fails at connect() time ("No route to host")
because the fib-rule doesn't jump to table 100 and the lookup ends up
being done in the main table.
Fixes:
|
||
![]() |
9d68bfa220 |
dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions.
commit ca43ccf41224b023fc290073d5603a755fd12eed upstream. Eric Dumazet pointed out [0] that when we call skb_set_owner_r() for ipv6_pinfo.pktoptions, sk_rmem_schedule() has not been called, resulting in a negative sk_forward_alloc. We add a new helper which clones a skb and sets its owner only when sk_rmem_schedule() succeeds. Note that we move skb_set_owner_r() forward in (dccp|tcp)_v6_do_rcv() because tcp_send_synack() can make sk_forward_alloc negative before ipv6_opt_accepted() in the crossed SYN-ACK or self-connect() cases. [0]: https://lore.kernel.org/netdev/CANn89iK9oc20Jdi_41jb9URdF210r7d1Y-+uypbMSbOfY6jqrg@mail.gmail.com/ Fixes: |
||
![]() |
c0af4d005a |
dccp/tcp: Reset saddr on failure after inet6?_hash_connect().
[ Upstream commit 77934dc6db0d2b111a8f2759e9ad2fb67f5cffa5 ] When connect() is called on a socket bound to the wildcard address, we change the socket's saddr to a local address. If the socket fails to connect() to the destination, we have to reset the saddr. However, when an error occurs after inet_hash6?_connect() in (dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave the socket bound to the address. From the user's point of view, whether saddr is reset or not varies with errno. Let's fix this inconsistent behaviour. Note that after this patch, the repro [0] will trigger the WARN_ON() in inet_csk_get_port() again, but this patch is not buggy and rather fixes a bug papering over the bhash2's bug for which we need another fix. For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect() by this sequence: s1 = socket() s1.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1) s1.bind(('127.0.0.1', 10000)) s1.sendto(b'hello', MSG_FASTOPEN, (('127.0.0.1', 10000))) # or s1.connect(('127.0.0.1', 10000)) s2 = socket() s2.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1) s2.bind(('0.0.0.0', 10000)) s2.connect(('127.0.0.1', 10000)) # -EADDRNOTAVAIL s2.listen(32) # WARN_ON(inet_csk(sk)->icsk_bind2_hash != tb2); [0]: https://syzkaller.appspot.com/bug?extid=015d756bbd1f8b5c8f09 Fixes: |
||
![]() |
2bf33b5ea4 |
tcp/udp: Make early_demux back namespacified.
commit 11052589cf5c0bab3b4884d423d5f60c38fcf25d upstream. Commit |
||
![]() |
f039b43cba |
inet: fully convert sk->sk_rx_dst to RCU rules
commit 8f905c0e7354ef261360fb7535ea079b1082c105 upstream.
syzbot reported various issues around early demux,
one being included in this changelog [1]
sk->sk_rx_dst is using RCU protection without clearly
documenting it.
And following sequences in tcp_v4_do_rcv()/tcp_v6_do_rcv()
are not following standard RCU rules.
[a] dst_release(dst);
[b] sk->sk_rx_dst = NULL;
They look wrong because a delete operation of RCU protected
pointer is supposed to clear the pointer before
the call_rcu()/synchronize_rcu() guarding actual memory freeing.
In some cases indeed, dst could be freed before [b] is done.
We could cheat by clearing sk_rx_dst before calling
dst_release(), but this seems the right time to stick
to standard RCU annotations and debugging facilities.
[1]
BUG: KASAN: use-after-free in dst_check include/net/dst.h:470 [inline]
BUG: KASAN: use-after-free in tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
Read of size 2 at addr ffff88807f1cb73a by task syz-executor.5/9204
CPU: 0 PID: 9204 Comm: syz-executor.5 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description.constprop.0.cold+0x8d/0x320 mm/kasan/report.c:247
__kasan_report mm/kasan/report.c:433 [inline]
kasan_report.cold+0x83/0xdf mm/kasan/report.c:450
dst_check include/net/dst.h:470 [inline]
tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
ip_rcv_finish_core.constprop.0+0x15de/0x1e80 net/ipv4/ip_input.c:340
ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
__netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
__netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
__netif_receive_skb_list net/core/dev.c:5608 [inline]
netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
gro_normal_list net/core/dev.c:5853 [inline]
gro_normal_list net/core/dev.c:5849 [inline]
napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
__napi_poll+0xaf/0x440 net/core/dev.c:7023
napi_poll net/core/dev.c:7090 [inline]
net_rx_action+0x801/0xb40 net/core/dev.c:7177
__do_softirq+0x29b/0x9c2 kernel/softirq.c:558
invoke_softirq kernel/softirq.c:432 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240
asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:629
RIP: 0033:0x7f5e972bfd57
Code: 39 d1 73 14 0f 1f 80 00 00 00 00 48 8b 50 f8 48 83 e8 08 48 39 ca 77 f3 48 39 c3 73 3e 48 89 13 48 8b 50 f8 48 89 38 49 8b 0e <48> 8b 3e 48 83 c3 08 48 83 c6 08 eb bc 48 39 d1 72 9e 48 39 d0 73
RSP: 002b:00007fff8a413210 EFLAGS: 00000283
RAX: 00007f5e97108990 RBX: 00007f5e97108338 RCX: ffffffff81d3aa45
RDX: ffffffff81d3aa45 RSI: 00007f5e97108340 RDI: ffffffff81d3aa45
RBP: 00007f5e97107eb8 R08: 00007f5e97108d88 R09: 0000000093c2e8d9
R10: 0000000000000000 R11: 0000000000000000 R12: 00007f5e97107eb0
R13: 00007f5e97108338 R14: 00007f5e97107ea8 R15: 0000000000000019
</TASK>
Allocated by task 13:
kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
kasan_set_track mm/kasan/common.c:46 [inline]
set_alloc_info mm/kasan/common.c:434 [inline]
__kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:467
kasan_slab_alloc include/linux/kasan.h:259 [inline]
slab_post_alloc_hook mm/slab.h:519 [inline]
slab_alloc_node mm/slub.c:3234 [inline]
slab_alloc mm/slub.c:3242 [inline]
kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3247
dst_alloc+0x146/0x1f0 net/core/dst.c:92
rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
ip_route_input_slow+0x1817/0x3a20 net/ipv4/route.c:2340
ip_route_input_rcu net/ipv4/route.c:2470 [inline]
ip_route_input_noref+0x116/0x2a0 net/ipv4/route.c:2415
ip_rcv_finish_core.constprop.0+0x288/0x1e80 net/ipv4/ip_input.c:354
ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
__netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
__netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
__netif_receive_skb_list net/core/dev.c:5608 [inline]
netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
gro_normal_list net/core/dev.c:5853 [inline]
gro_normal_list net/core/dev.c:5849 [inline]
napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
__napi_poll+0xaf/0x440 net/core/dev.c:7023
napi_poll net/core/dev.c:7090 [inline]
net_rx_action+0x801/0xb40 net/core/dev.c:7177
__do_softirq+0x29b/0x9c2 kernel/softirq.c:558
Freed by task 13:
kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
kasan_set_track+0x21/0x30 mm/kasan/common.c:46
kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370
____kasan_slab_free mm/kasan/common.c:366 [inline]
____kasan_slab_free mm/kasan/common.c:328 [inline]
__kasan_slab_free+0xff/0x130 mm/kasan/common.c:374
kasan_slab_free include/linux/kasan.h:235 [inline]
slab_free_hook mm/slub.c:1723 [inline]
slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1749
slab_free mm/slub.c:3513 [inline]
kmem_cache_free+0xbd/0x5d0 mm/slub.c:3530
dst_destroy+0x2d6/0x3f0 net/core/dst.c:127
rcu_do_batch kernel/rcu/tree.c:2506 [inline]
rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2741
__do_softirq+0x29b/0x9c2 kernel/softirq.c:558
Last potentially related work creation:
kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
__kasan_record_aux_stack+0xf5/0x120 mm/kasan/generic.c:348
__call_rcu kernel/rcu/tree.c:2985 [inline]
call_rcu+0xb1/0x740 kernel/rcu/tree.c:3065
dst_release net/core/dst.c:177 [inline]
dst_release+0x79/0xe0 net/core/dst.c:167
tcp_v4_do_rcv+0x612/0x8d0 net/ipv4/tcp_ipv4.c:1712
sk_backlog_rcv include/net/sock.h:1030 [inline]
__release_sock+0x134/0x3b0 net/core/sock.c:2768
release_sock+0x54/0x1b0 net/core/sock.c:3300
tcp_sendmsg+0x36/0x40 net/ipv4/tcp.c:1441
inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:724
sock_write_iter+0x289/0x3c0 net/socket.c:1057
call_write_iter include/linux/fs.h:2162 [inline]
new_sync_write+0x429/0x660 fs/read_write.c:503
vfs_write+0x7cd/0xae0 fs/read_write.c:590
ksys_write+0x1ee/0x250 fs/read_write.c:643
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff88807f1cb700
which belongs to the cache ip_dst_cache of size 176
The buggy address is located 58 bytes inside of
176-byte region [ffff88807f1cb700, ffff88807f1cb7b0)
The buggy address belongs to the page:
page:ffffea0001fc72c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7f1cb
flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000200 dead000000000100 dead000000000122 ffff8881413bb780
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 5, ts 108466983062, free_ts 108048976062
prep_new_page mm/page_alloc.c:2418 [inline]
get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4149
__alloc_pages+0x1b2/0x500 mm/page_alloc.c:5369
alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191
alloc_slab_page mm/slub.c:1793 [inline]
allocate_slab mm/slub.c:1930 [inline]
new_slab+0x32d/0x4a0 mm/slub.c:1993
___slab_alloc+0x918/0xfe0 mm/slub.c:3022
__slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109
slab_alloc_node mm/slub.c:3200 [inline]
slab_alloc mm/slub.c:3242 [inline]
kmem_cache_alloc+0x35c/0x3a0 mm/slub.c:3247
dst_alloc+0x146/0x1f0 net/core/dst.c:92
rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
__mkroute_output net/ipv4/route.c:2564 [inline]
ip_route_output_key_hash_rcu+0x921/0x2d00 net/ipv4/route.c:2791
ip_route_output_key_hash+0x18b/0x300 net/ipv4/route.c:2619
__ip_route_output_key include/net/route.h:126 [inline]
ip_route_output_flow+0x23/0x150 net/ipv4/route.c:2850
ip_route_output_key include/net/route.h:142 [inline]
geneve_get_v4_rt+0x3a6/0x830 drivers/net/geneve.c:809
geneve_xmit_skb drivers/net/geneve.c:899 [inline]
geneve_xmit+0xc4a/0x3540 drivers/net/geneve.c:1082
__netdev_start_xmit include/linux/netdevice.h:4994 [inline]
netdev_start_xmit include/linux/netdevice.h:5008 [inline]
xmit_one net/core/dev.c:3590 [inline]
dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3606
__dev_queue_xmit+0x299a/0x3650 net/core/dev.c:4229
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1338 [inline]
free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1389
free_unref_page_prepare mm/page_alloc.c:3309 [inline]
free_unref_page+0x19/0x690 mm/page_alloc.c:3388
qlink_free mm/kasan/quarantine.c:146 [inline]
qlist_free_all+0x5a/0xc0 mm/kasan/quarantine.c:165
kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:272
__kasan_slab_alloc+0xa2/0xc0 mm/kasan/common.c:444
kasan_slab_alloc include/linux/kasan.h:259 [inline]
slab_post_alloc_hook mm/slab.h:519 [inline]
slab_alloc_node mm/slub.c:3234 [inline]
kmem_cache_alloc_node+0x255/0x3f0 mm/slub.c:3270
__alloc_skb+0x215/0x340 net/core/skbuff.c:414
alloc_skb include/linux/skbuff.h:1126 [inline]
alloc_skb_with_frags+0x93/0x620 net/core/skbuff.c:6078
sock_alloc_send_pskb+0x783/0x910 net/core/sock.c:2575
mld_newpack+0x1df/0x770 net/ipv6/mcast.c:1754
add_grhead+0x265/0x330 net/ipv6/mcast.c:1857
add_grec+0x1053/0x14e0 net/ipv6/mcast.c:1995
mld_send_initial_cr.part.0+0xf6/0x230 net/ipv6/mcast.c:2242
mld_send_initial_cr net/ipv6/mcast.c:1232 [inline]
mld_dad_work+0x1d3/0x690 net/ipv6/mcast.c:2268
process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
worker_thread+0x658/0x11f0 kernel/workqueue.c:2445
Memory state around the buggy address:
ffff88807f1cb600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88807f1cb680: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff88807f1cb700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88807f1cb780: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
ffff88807f1cb800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes:
|
||
![]() |
e4a7acd6b4 |
tcp: Fix data-races around sysctl_tcp_reflect_tos.
[ Upstream commit 870e3a634b6a6cb1543b359007aca73fe6a03ac5 ]
While reading sysctl_tcp_reflect_tos, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.
Fixes:
|
||
![]() |
6950ee32c1 |
lsm,selinux: pass flowi_common instead of flowi to the LSM hooks
[ Upstream commit 3df98d79215ace13d1e91ddfc5a67a0f5acbd83f ] As pointed out by Herbert in a recent related patch, the LSM hooks do not have the necessary address family information to use the flowi struct safely. As none of the LSMs currently use any of the protocol specific flowi information, replace the flowi pointers with pointers to the address family independent flowi_common struct. Reported-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
![]() |
2fee3cf4c9 |
ipv6: tcp: drop silly ICMPv6 packet too big messages
commit c7bb4b89033b764eb07db4e060548a6311d801ee upstream.
While TCP stack scales reasonably well, there is still one part that
can be used to DDOS it.
IPv6 Packet too big messages have to lookup/insert a new route,
and if abused by attackers, can easily put hosts under high stress,
with many cpus contending on a spinlock while one is stuck in fib6_run_gc()
ip6_protocol_deliver_rcu()
icmpv6_rcv()
icmpv6_notify()
tcp_v6_err()
tcp_v6_mtu_reduced()
inet6_csk_update_pmtu()
ip6_rt_update_pmtu()
__ip6_rt_update_pmtu()
ip6_rt_cache_alloc()
ip6_dst_alloc()
dst_alloc()
ip6_dst_gc()
fib6_run_gc()
spin_lock_bh() ...
Some of our servers have been hit by malicious ICMPv6 packets
trying to _increase_ the MTU/MSS of TCP flows.
We believe these ICMPv6 packets are a result of a bug in one ISP stack,
since they were blindly sent back for _every_ (small) packet sent to them.
These packets are for one TCP flow:
09:24:36.266491 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.266509 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.316688 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.316704 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
09:24:36.608151 IP6 Addr1 > Victim ICMP6, packet too big, mtu 1460, length 1240
TCP stack can filter some silly requests :
1) MTU below IPV6_MIN_MTU can be filtered early in tcp_v6_err()
2) tcp_v6_mtu_reduced() can drop requests trying to increase current MSS.
This tests happen before the IPv6 routing stack is entered, thus
removing the potential contention and route exhaustion.
Note that IPv6 stack was performing these checks, but too late
(ie : after the route has been added, and after the potential
garbage collect war)
v2: fix typo caught by Martin, thanks !
v3: exports tcp_mtu_to_mss(), caught by David, thanks !
Fixes:
|
||
![]() |
d60f07bcb7 |
tcp: annotate data races around tp->mtu_info
commit 561022acb1ce62e50f7a8258687a21b84282a4cb upstream.
While tp->mtu_info is read while socket is owned, the write
sides happen from err handlers (tcp_v[46]_mtu_reduced)
which only own the socket spinlock.
Fixes:
|
||
![]() |
b61d8814c4 |
net: send SYNACK packet with accepted fwmark
commit 43b90bfad34bcb81b8a5bc7dc650800f4be1787e upstream. commit |
||
![]() |
5f64c4c550 |
ipv6: weaken the v4mapped source check
[ Upstream commit dcc32f4f183ab8479041b23a1525d48233df1d43 ] This reverts commit |
||
![]() |
8ef44b6fe4 |
tcp: Retain ECT bits for tos reflection
For DCTCP, we have to retain the ECT bits set by the congestion control
algorithm on the socket when reflecting syn TOS in syn-ack, in order to
make ECN work properly.
Fixes:
|
||
![]() |
407c85c7dd |
tcp: Set ECT0 bit in tos/tclass for synack when BPF needs ECN
When a BPF program is used to select between a type of TCP congestion
control algorithm that uses either ECN or not there is a case where the
synack for the frame was coming up without the ECT0 bit set. A bit of
research found that this was due to the final socket being configured to
dctcp while the listener socket was staying in cubic.
To reproduce it all that is needed is to monitor TCP traffic while running
the sample bpf program "samples/bpf/tcp_cong_kern.c". What is observed,
assuming tcp_dctcp module is loaded or compiled in and the traffic matches
the rules in the sample file, is that for all frames with the exception of
the synack the ECT0 bit is set.
To address that it is necessary to make one additional call to
tcp_bpf_ca_needs_ecn using the request socket and then use the output of
that to set the ECT0 bit for the tos/tclass of the packet.
Fixes:
|
||
![]() |
01770a1661 |
tcp: fix race condition when creating child sockets from syncookies
When the TCP stack is in SYN flood mode, the server child socket is created from the SYN cookie received in a TCP packet with the ACK flag set. The child socket is created when the server receives the first TCP packet with a valid SYN cookie from the client. Usually, this packet corresponds to the final step of the TCP 3-way handshake, the ACK packet. But is also possible to receive a valid SYN cookie from the first TCP data packet sent by the client, and thus create a child socket from that SYN cookie. Since a client socket is ready to send data as soon as it receives the SYN+ACK packet from the server, the client can send the ACK packet (sent by the TCP stack code), and the first data packet (sent by the userspace program) almost at the same time, and thus the server will equally receive the two TCP packets with valid SYN cookies almost at the same instant. When such event happens, the TCP stack code has a race condition that occurs between the momement a lookup is done to the established connections hashtable to check for the existence of a connection for the same client, and the moment that the child socket is added to the established connections hashtable. As a consequence, this race condition can lead to a situation where we add two child sockets to the established connections hashtable and deliver two sockets to the userspace program to the same client. This patch fixes the race condition by checking if an existing child socket exists for the same client when we are adding the second child socket to the established connections socket. If an existing child socket exists, we drop the packet and discard the second child socket to the same client. Signed-off-by: Ricardo Dias <rdias@singlestore.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20201120111133.GA67501@rdias-suse-pc.lan Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
![]() |
861602b577 |
tcp: Allow full IP tos/IPv6 tclass to be reflected in L3 header
An issue was recently found where DCTCP SYN/ACK packets did not have the
ECT bit set in the L3 header. A bit of code review found that the recent
change referenced below had gone though and added a mask that prevented the
ECN bits from being populated in the L3 header.
This patch addresses that by rolling back the mask so that it is only
applied to the flags coming from the incoming TCP request instead of
applying it to the socket tos/tclass field. Doing this the ECT bits were
restored in the SYN/ACK packets in my testing.
One thing that is not addressed by this patch set is the fact that
tcp_reflect_tos appears to be incompatible with ECN based congestion
avoidance algorithms. At a minimum the feature should likely be documented
which it currently isn't.
Fixes:
|
||
![]() |
634a63e73f |
net: ipv6: delete duplicated words
Drop repeated words in net/ipv6/. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
ac8f1710c1 |
tcp: reflect tos value received in SYN to the socket
This commit adds a new TCP feature to reflect the tos value received in SYN, and send it out on the SYN-ACK, and eventually set the tos value of the established socket with this reflected tos value. This provides a way to set the traffic class/QoS level for all traffic in the same connection to be the same as the incoming SYN request. It could be useful in data centers to provide equivalent QoS according to the incoming request. This feature is guarded by /proc/sys/net/ipv4/tcp_reflect_tos, and is by default turned off. Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
e92dd77e6f |
ipv6: add tos reflection in TCP reset and ack
Currently, ipv6 stack does not do any TOS reflection. To make the behavior consistent with v4 stack, this commit adds TOS reflection in tcp_v6_reqsk_send_ack() and tcp_v6_send_reset(). We clear the lower 2-bit ECN value of the received TOS in compliance with RFC 3168 6.1.5 robustness principles. Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
331fca4315 |
bpf: tcp: Add bpf_skops_hdr_opt_len() and bpf_skops_write_hdr_opt()
The bpf prog needs to parse the SYN header to learn what options have been sent by the peer's bpf-prog before writing its options into SYNACK. This patch adds a "syn_skb" arg to tcp_make_synack() and send_synack(). This syn_skb will eventually be made available (as read-only) to the bpf prog. This will be the only SYN packet available to the bpf prog during syncookie. For other regular cases, the bpf prog can also use the saved_syn. When writing options, the bpf prog will first be called to tell the kernel its required number of bytes. It is done by the new bpf_skops_hdr_opt_len(). The bpf prog will only be called when the new BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG is set in tp->bpf_sock_ops_cb_flags. When the bpf prog returns, the kernel will know how many bytes are needed and then update the "*remaining" arg accordingly. 4 byte alignment will be included in the "*remaining" before this function returns. The 4 byte aligned number of bytes will also be stored into the opts->bpf_opt_len. "bpf_opt_len" is a newly added member to the struct tcp_out_options. Then the new bpf_skops_write_hdr_opt() will call the bpf prog to write the header options. The bpf prog is only called if it has reserved spaces before (opts->bpf_opt_len > 0). The bpf prog is the last one getting a chance to reserve header space and writing the header option. These two functions are half implemented to highlight the changes in TCP stack. The actual codes preparing the bpf running context and invoking the bpf prog will be added in the later patch with other necessary bpf pieces. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200820190052.2885316-1-kafai@fb.com |
||
![]() |
d4c19c4914 |
net/tcp: switch ->md5_parse to sockptr_t
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
3021ad5299 |
net/ipv6: remove compat_ipv6_{get,set}sockopt
Handle the few cases that need special treatment in-line using in_compat_syscall(). This also removes all the now unused compat_{get,set}sockopt methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
dd2e0b86fc |
tcp: remove indirect calls for icsk->icsk_af_ops->send_check
Mitigate RETPOLINE costs in __tcp_transmit_skb() by using INDIRECT_CALL_INET() wrapper. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
6abde0b241 |
crypto/chtls: IPv6 support for inline TLS
Extends support to IPv6 for Inline TLS server. Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> v1->v2: - cc'd tcp folks. v2->v3: - changed EXPORT_SYMBOL() to EXPORT_SYMBOL_GPL() Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
d29245692a |
tcp: ipv6: support RFC 6069 (TCP-LD)
Make tcp_ld_RTO_revert() helper available to IPv6, and implement RFC 6069 : Quoting this RFC : 3. Connectivity Disruption Indication For Internet Protocol version 6 (IPv6) [RFC2460], the counterpart of the ICMP destination unreachable message of code 0 (net unreachable) and of code 1 (host unreachable) is the ICMPv6 destination unreachable message of code 0 (no route to destination) [RFC4443]. As with IPv4, a router should generate an ICMPv6 destination unreachable message of code 0 in response to a packet that cannot be delivered to its destination address because it lacks a matching entry in its routing table. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
45af29ca76 |
tcp: allow traceroute -Mtcp for unpriv users
Unpriv users can use traceroute over plain UDP sockets, but not TCP ones. $ traceroute -Mtcp 8.8.8.8 You do not have enough privileges to use this traceroute method. $ traceroute -n -Mudp 8.8.8.8 traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets 1 192.168.86.1 3.631 ms 3.512 ms 3.405 ms 2 10.1.10.1 4.183 ms 4.125 ms 4.072 ms 3 96.120.88.125 20.621 ms 19.462 ms 20.553 ms 4 96.110.177.65 24.271 ms 25.351 ms 25.250 ms 5 69.139.199.197 44.492 ms 43.075 ms 44.346 ms 6 68.86.143.93 27.969 ms 25.184 ms 25.092 ms 7 96.112.146.18 25.323 ms 96.112.146.22 25.583 ms 96.112.146.26 24.502 ms 8 72.14.239.204 24.405 ms 74.125.37.224 16.326 ms 17.194 ms 9 209.85.251.9 18.154 ms 209.85.247.55 14.449 ms 209.85.251.9 26.296 ms^C We can easily support traceroute over TCP, by queueing an error message into socket error queue. Note that applications need to set IP_RECVERR/IPV6_RECVERR option to enable this feature, and that the error message is only queued while in SYN_SNT state. socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3 setsockopt(3, SOL_IPV6, IPV6_RECVERR, [1], 4) = 0 setsockopt(3, SOL_SOCKET, SO_TIMESTAMP_OLD, [1], 4) = 0 setsockopt(3, SOL_IPV6, IPV6_UNICAST_HOPS, [5], 4) = 0 connect(3, {sa_family=AF_INET6, sin6_port=htons(8787), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2002:a05:6608:297::", &sin6_addr), sin6_scope_id=0}, 28) = -1 EHOSTUNREACH (No route to host) recvmsg(3, {msg_name={sa_family=AF_INET6, sin6_port=htons(8787), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2002:a05:6608:297::", &sin6_addr), sin6_scope_id=0}, msg_namelen=1024->28, msg_iov=[{iov_base="`\r\337\320\0004\6\1&\7\370\260\200\231\16\27\0\0\0\0\0\0\0\0 \2\n\5f\10\2\227"..., iov_len=1024}], msg_iovlen=1, msg_control=[{cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=SO_TIMESTAMP_OLD, cmsg_data={tv_sec=1590340680, tv_usec=272424}}, {cmsg_len=60, cmsg_level=SOL_IPV6, cmsg_type=IPV6_RECVERR}], msg_controllen=96, msg_flags=MSG_ERRQUEUE}, MSG_ERRQUEUE) = 144 Suggested-by: Maciej Żenczykowski <maze@google.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
a8eceea84a |
inet: Use fallthrough;
Convert the various uses of fallthrough comments to fallthrough; Done via script Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/ And by hand: net/ipv6/ip6_fib.c has a fallthrough comment outside of an #ifdef block that causes gcc to emit a warning if converted in-place. So move the new fallthrough; inside the containing #ifdef/#endif too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
31484d56ca |
mptcp: Fix undefined mptcp_handle_ipv6_mapped for modular IPV6
If CONFIG_MPTCP=y, CONFIG_MPTCP_IPV6=n, and CONFIG_IPV6=m:
ERROR: "mptcp_handle_ipv6_mapped" [net/ipv6/ipv6.ko] undefined!
This does not happen if CONFIG_MPTCP_IPV6=y, as CONFIG_MPTCP_IPV6
selects CONFIG_IPV6, and thus forces CONFIG_IPV6 builtin.
As exporting a symbol for an empty function would be a bit wasteful, fix
this by providing a dummy version of mptcp_handle_ipv6_mapped() for the
CONFIG_MPTCP_IPV6=n case.
Rename mptcp_handle_ipv6_mapped() to mptcpv6_handle_mapped(), to make it
clear this is a pure-IPV6 function, just like mptcpv6_init().
Fixes:
|
||
![]() |
cec37a6e41 |
mptcp: Handle MP_CAPABLE options for outgoing connections
Add hooks to tcp_output.c to add MP_CAPABLE to an outgoing SYN request, to capture the MP_CAPABLE in the received SYN-ACK, to add MP_CAPABLE to the final ACK of the three-way handshake. Use the .sk_rx_dst_set() handler in the subflow proto to capture when the responding SYN-ACK is received and notify the MPTCP connection layer. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
f870fa0b57 |
mptcp: Add MPTCP socket stubs
Implements the infrastructure for MPTCP sockets. MPTCP sockets open one in-kernel TCP socket per subflow. These subflow sockets are only managed by the MPTCP socket that owns them and are not visible from userspace. This commit allows a userspace program to open an MPTCP socket with: sock = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); The resulting socket is simply a wrapper around a single regular TCP socket, without any of the MPTCP protocol implemented over the wire. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
35b2c32116 |
tcp: Export TCP functions and ops struct
MPTCP will make use of tcp_send_mss() and tcp_push() when sending data to specific TCP subflows. tcp_request_sock_ipvX_ops and ipvX_specific will be referenced during TCP subflow creation. Co-developed-by: Peter Krystad <peter.krystad@linux.intel.com> Signed-off-by: Peter Krystad <peter.krystad@linux.intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
6b102db50c |
net: Add device index to tcp_md5sig
Add support for userspace to specify a device index to limit the scope of an entry via the TCP_MD5SIG_EXT setsockopt. The existing __tcpm_pad is renamed to tcpm_ifindex and the new field is only checked if the new TCP_MD5SIG_FLAG_IFINDEX is set in tcpm_flags. For now, the device index must point to an L3 master device (e.g., VRF). The API and error handling are setup to allow the constraint to be relaxed in the future to any device index. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
dea53bb80e |
tcp: Add l3index to tcp_md5sig_key and md5 functions
Add l3index to tcp_md5sig_key to represent the L3 domain of a key, and add l3index to tcp_md5_do_add and tcp_md5_do_del to fill in the key. With the key now based on an l3index, add the new parameter to the lookup functions and consider the l3index when looking for a match. The l3index comes from the skb when processing ingress packets leveraging the helpers created for socket lookups, tcp_v4_sdif and inet_iif (and the v6 variants). When the sdif index is set it means the packet ingressed a device that is part of an L3 domain and inet_iif points to the VRF device. For egress, the L3 domain is determined from the socket binding and sk_bound_dev_if. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
d14c77e0b2 |
ipv6/tcp: Pass dif and sdif to tcp_v6_inbound_md5_hash
The original ingress device index is saved to the cb space of the skb and the cb is moved during tcp processing. Since tcp_v6_inbound_md5_hash can be called before and after the cb move, pass dif and sdif to it so the caller can save both prior to the cb move. Both are used by a later patch. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
c4e85f73af |
net: ipv6: add net argument to ip6_dst_lookup_flow
This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow,
as some modules currently pass a net argument without a socket to
ip6_dst_lookup. This is equivalent to commit
|
||
![]() |
288efe8606 |
net: annotate lockless accesses to sk->sk_ack_backlog
sk->sk_ack_backlog can be read without any lock being held. We need to use READ_ONCE()/WRITE_ONCE() to avoid load/store tearing and/or potential KCSAN warnings. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
0f31746452 |
tcp: annotate tp->write_seq lockless reads
There are few places where we fetch tp->write_seq while this field can change from IRQ or other cpu. We need to add READ_ONCE() annotations, and also make sure write sides use corresponding WRITE_ONCE() to avoid store-tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
7db48e9839 |
tcp: annotate tp->copied_seq lockless reads
There are few places where we fetch tp->copied_seq while this field can change from IRQ or other cpu. We need to add READ_ONCE() annotations, and also make sure write sides use corresponding WRITE_ONCE() to avoid store-tearing. Note that tcp_inq_hint() was already using READ_ONCE(tp->copied_seq) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
dba7d9b8c7 |
tcp: annotate tp->rcv_nxt lockless reads
There are few places where we fetch tp->rcv_nxt while this field can change from IRQ or other cpu. We need to add READ_ONCE() annotations, and also make sure write sides use corresponding WRITE_ONCE() to avoid store-tearing. Note that tcp_inq_hint() was already using READ_ONCE(tp->rcv_nxt) syzbot reported : BUG: KCSAN: data-race in tcp_poll / tcp_queue_rcv write to 0xffff888120425770 of 4 bytes by interrupt on cpu 0: tcp_rcv_nxt_update net/ipv4/tcp_input.c:3365 [inline] tcp_queue_rcv+0x180/0x380 net/ipv4/tcp_input.c:4638 tcp_rcv_established+0xbf1/0xf50 net/ipv4/tcp_input.c:5616 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1542 tcp_v4_rcv+0x1a03/0x1bf0 net/ipv4/tcp_ipv4.c:1923 ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208 napi_skb_finish net/core/dev.c:5671 [inline] napi_gro_receive+0x28f/0x330 net/core/dev.c:5704 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061 read to 0xffff888120425770 of 4 bytes by task 7254 on cpu 1: tcp_stream_is_readable net/ipv4/tcp.c:480 [inline] tcp_poll+0x204/0x6b0 net/ipv4/tcp.c:554 sock_poll+0xed/0x250 net/socket.c:1256 vfs_poll include/linux/poll.h:90 [inline] ep_item_poll.isra.0+0x90/0x190 fs/eventpoll.c:892 ep_send_events_proc+0x113/0x5c0 fs/eventpoll.c:1749 ep_scan_ready_list.constprop.0+0x189/0x500 fs/eventpoll.c:704 ep_send_events fs/eventpoll.c:1793 [inline] ep_poll+0xe3/0x900 fs/eventpoll.c:1930 do_epoll_wait+0x162/0x180 fs/eventpoll.c:2294 __do_sys_epoll_pwait fs/eventpoll.c:2325 [inline] __se_sys_epoll_pwait fs/eventpoll.c:2311 [inline] __x64_sys_epoll_pwait+0xcd/0x170 fs/eventpoll.c:2311 do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 7254 Comm: syz-fuzzer Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
d983ea6f16 |
tcp: add rcu protection around tp->fastopen_rsk
Both tcp_v4_err() and tcp_v6_err() do the following operations
while they do not own the socket lock :
fastopen = tp->fastopen_rsk;
snd_una = fastopen ? tcp_rsk(fastopen)->snt_isn : tp->snd_una;
The problem is that without appropriate barrier, the compiler
might reload tp->fastopen_rsk and trigger a NULL deref.
request sockets are protected by RCU, we can simply add
the missing annotations and barriers to solve the issue.
Fixes:
|
||
![]() |
f6c0f5d209 |
tcp: honor SO_PRIORITY in TIME_WAIT state
ctl packets sent on behalf of TIME_WAIT sockets currently have a zero skb->priority, which can cause various problems. In this patch we : - add a tw_priority field in struct inet_timewait_sock. - populate it from sk->sk_priority when a TIME_WAIT is created. - For IPv4, change ip_send_unicast_reply() and its two callers to propagate tw_priority correctly. ip_send_unicast_reply() no longer changes sk->sk_priority. - For IPv6, make sure TIME_WAIT sockets pass their tw_priority field to tcp_v6_send_response() and tcp_v6_send_ack(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
e9a5dceee5 |
ipv6: tcp: provide sk->sk_priority to ctl packets
We can populate skb->priority for some ctl packets instead of always using zero. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
4f6570d720 |
ipv6: add priority parameter to ip6_xmit()
Currently, ip6_xmit() sets skb->priority based on sk->sk_priority This is not desirable for TCP since TCP shares the same ctl socket for a given netns. We want to be able to send RST or ACK packets with a non zero skb->priority. This patch has no functional change. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
9349d600fb |
tcp: add skb-less helpers to retrieve SYN cookie
This patch allows generation of a SYN cookie before an SKB has been allocated, as is the case at XDP. Signed-off-by: Petar Penkov <ppenkov@google.com> Reviewed-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |