154900b51bc320a4361dfa0de0302e7056cd6a44
94 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
43dee37fa9 |
Merge remote-tracking branch 'remotes/origin/tmp-0453f563cce0' into msm-5.4
* remotes/origin/tmp-0453f563cce0:
FROMLIST: pwm: Convert period and duty cycle to u64
ANDROID: gki_defconfig: FW_CACHE to no
FROMGIT: firmware_class: make firmware caching configurable
ANDROID: gki_defconfig: removed CONFIG_PM_WAKELOCKS
ANDROID: gki_defconfig: enable CONFIG_IKHEADERS as m
ANDROID: update ABI representation
FROMLIST: reboot: Export reboot_mode
FROMLIST: iommu/arm-smmu: Update my email address in MODULE_AUTHOR()
FROMLIST: iommu/arm-smmu: Allow building as a module
FROMLIST: iommu/arm-smmu: Unregister IOMMU and bus ops on device removal
FROMLIST: iommu/arm-smmu-v3: Allow building as a module
FROMLIST: iommu/arm-smmu-v3: Unregister IOMMU and bus ops on device removal
FROMLIST: iommu/arm-smmu: Prevent forced unbinding of Arm SMMU drivers
FROMLIST: Revert "iommu/arm-smmu: Make arm-smmu explicitly non-modular"
FROMLIST: Revert "iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modular"
FROMLIST: drivers/iommu: Allow IOMMU bus ops to be unregistered
FROMLIST: iommu/of: Take a ref to the IOMMU driver during ->of_xlate()
FROMLIST: drivers/iommu: Take a ref to the IOMMU driver prior to ->add_device()
FROMLIST: PCI: Export pci_ats_disabled() as a GPL symbol to modules
FROMLIST: iommu/of: Request ACS from the PCI core when configuring IOMMU linkage
FROMLIST: drivers/iommu: Export core IOMMU API symbols to permit modular drivers
FROMGIT: of: property: Add device link support for "iommu-map"
Revert "FROMLIST: iommu: Export core IOMMU functions to kernel modules"
Revert "FROMLIST: PCI: Export PCI ACS and DMA searching functions to modules"
Revert "FROMLIST: of: Export of_phandle_iterator_args() to modules"
ANDROID: initial branch preparations for 5.4
Linux 5.4
cramfs: fix usage on non-MTD device
Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"
afs: Fix large file support
afs: Fix possible assert with callbacks from yfs servers
r8152: avoid to call napi_disable twice
MAINTAINERS: Add myself as maintainer of virtio-vsock
udp: drop skb extensions before marking skb stateless
net: rtnetlink: prevent underflows in do_setvfinfo()
mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()
Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
can: m_can_platform: remove unnecessary m_can_class_resume() call
can: m_can_platform: set net_device structure as driver data
hv_netvsc: Fix send_table offset in case of a host bug
hv_netvsc: Fix offset usage in netvsc_send_table()
net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN
ANDROID: gki_defconfig: disable FUNCTION_TRACER
sfc: Only cancel the PPS workqueue if it exists
nfc: port100: handle command failure cleanly
drm/i915/fbdev: Restore physical addresses for fb_mmap()
net-sysfs: fix netdev_queue_add_kobject() breakage
ANDROID: update ABI representation
ANDROID: add unstripped modules to the distribution
Revert "drm/amd/display: enable S/G for RAVEN chip"
drm/amdgpu: disable gfxoff on original raven
drm/amdgpu: disable gfxoff when using register read interface
drm/amd/powerplay: correct fine grained dpm force level setting
drm/amd/powerplay: issue no PPSMC_MSG_GetCurrPkgPwr on unsupported ASICs
drm/amdgpu: remove experimental flag for Navi14
r8152: Re-order napi_disable in rtl8152_close
net: qca_spi: Move reset_count to struct qcaspi
net: qca_spi: fix receive buffer size check
net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode
Revert "net/ibmvnic: Fix EOI when running in XIVE mode"
net/mlxfw: Verify FSM error code translation doesn't exceed array size
net/mlx5: Update the list of the PCI supported devices
net/mlx5: Fix auto group size calculation
net/mlx5e: Add missing capability bit check for IP-in-IP
net/mlx5e: Do not use non-EXT link modes in EXT mode
net/mlx5e: Fix set vf link state error flow
net/mlx5: DR, Limit STE hash table enlarge based on bytemask
net/mlx5: DR, Skip rehash for tables with byte mask zero
net/mlx5: DR, Fix invalid EQ vector number on CQ creation
net/mlx5e: Reorder mirrer action parsing to check for encap first
net/mlx5e: Fix ingress rate configuration for representors
net/mlx5e: Fix error flow cleanup in mlx5e_tc_tun_create_header_ipv4/6
s390/qeth: return proper errno on IO error
s390/qeth: fix potential deadlock on workqueue flush
ipv6/route: return if there is no fib_nh_gw_family
net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject
arm64: uaccess: Remove uaccess_*_not_uao asm macros
arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault
BACKPORT: FROMLIST: pwm: Add support for different PWM output types
ANDROID: SoC: core: Introduce macro SOC_SINGLE_MULTI_EXT
ANDROID: usb: gadget: configfs: fix compiler warning
fork: fix pidfd_poll()'s return type
PM: QoS: Invalidate frequency QoS requests after removal
virtio_balloon: fix shrinker count
virtio_balloon: fix shrinker scan number of pages
mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n
Revert "mdio_bus: fix mdio_register_device when RESET_CONTROLLER is disabled"
net: hns3: fix a wrong reset interrupt status mask
net: fec: fix clock count mis-match
net/sched: act_pedit: fix WARN() in the traffic path
net: phylink: fix link mode modification in PHY mode
net: phylink: update documentation on create and destroy
r8169: disable TSO on a single version of RTL8168c to fix performance
MAINTAINERS: forcedeth: Change Zhu Yanjun's email address
taprio: don't reject same mqprio settings
net/tls: enable sk_msg redirect to tls socket egress
afs: Fix missing timeout reset
ANDROID: drivers: gpu: drm: add support to batch commands
gve: fix dma sync bug where not all pages synced
drm/i915: make pool objects read-only
mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n
nbd:fix memory leak in nbd_get_socket()
virtio_console: allocate inbufs in add_port() only if it is needed
virtio_ring: fix return code on DMA mapping fails
mdio_bus: fix mdio_register_device when RESET_CONTROLLER is disabled
net/ipv4: fix sysctl max for fib_multipath_hash_policy
phy: mdio-sun4i: add missed regulator_disable in remove
net/mlx4_en: Fix wrong limitation for number of TX rings
net: sched: ensure opts_len <= IP_TUNNEL_OPTS_MAX in act_tunnel_key
mlxsw: spectrum_router: Fix determining underlay for a GRE tunnel
net: atm: Reduce the severity of logging in unlink_clip_vcc
ANDROID: virtio_blk: Remove BUG_ON for discard/zero ops
drm/i915: Protect request peeking with RCU
drm/i915/userptr: Try to acquire the page lock around set_page_dirty()
drm/i915/pmu: "Frequency" is reported as accumulated cycles
drm/i915: Preload LUTs if the hw isn't currently using them
drm/i915: Don't oops in dumb_create ioctl if we have no crtcs
ANDROID: usb: gadget: configfs: Support multiple android instances
Linux 5.4-rc8
net/mlx4_en: fix mlx4 ethtool -N insertion
Revert "hwrng: core - Freeze khwrng thread during suspend"
ipmr: Fix skb headroom in ipmr_get_route().
net: hns3: cleanup of stray struct hns3_link_mode_mapping
net/smc: fix fastopen for non-blocking connect()
rds: ib: update WR sizes when bringing up connection
net: gemini: add missed free_netdev
net: dsa: tag_8021q: Fix dsa_8021q_restore_pvid for an absent pvid
seg6: fix skb transport_header after decap_and_validate()
seg6: fix srh pointer in get_srh()
net: stmmac: Use the correct style for SPDX License Identifier
octeontx2-af: Use the correct style for SPDX License Identifier
mm/debug.c: PageAnon() is true for PageKsm() pages
mm/debug.c: __dump_page() prints an extra line
mm/page_io.c: do not free shared swap slots
mm/memory_hotplug: fix try_offline_node()
mm,thp: recheck each page before collapsing file THP
mm: slub: really fix slab walking for init_on_free
mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()
mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
lib/xz: fix XZ_DYNALLOC to avoid useless memory reallocations
mm: fix trying to reclaim unevictable lru page when calling madvise_pageout
mm: mempolicy: fix the wrong return value and potential pages leak of mbind
ANDROID: ion: Fix buffer_lock mutex initialization
Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation
i2c: core: fix use after free in of_i2c_notify
i2c: acpi: Force bus speed to 400KHz if a Silead touchscreen is present
ptp: Extend the test program to check the external time stamp flags.
mlx5: Reject requests to enable time stamping on both edges.
igb: Reject requests that fail to enable time stamping on both edges.
dp83640: Reject requests to enable time stamping on both edges.
mv88e6xxx: Reject requests to enable time stamping on both edges.
ptp: Introduce strict checking of external time stamp options.
renesas: reject unsupported external timestamp flags
mlx5: reject unsupported external timestamp flags
igb: reject unsupported external timestamp flags
dp83640: reject unsupported external timestamp flags
mv88e6xxx: reject unsupported external timestamp flags
net: reject PTP periodic output requests with unsupported flags
ptp: Validate requests to enable time stamping of external signals.
net: ep93xx_eth: fix mismatch of request_mem_region in remove
ax88172a: fix information leak on short answers
selftests: mlxsw: Adjust test to recent changes
Input: synaptics-rmi4 - destroy F54 poller workqueue when removing
Input: ff-memless - kill timer in destroy()
FROMGIT: driver core: Allow device link operations inside sync_state()
afs: Fix race in commit bulk status fetch
Revert "ANDROID: Revert "Merge tag 'modules-for-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux""
sched/uclamp: Fix incorrect condition
KVM: Add a comment describing the /dev/kvm no_compat handling
net: hns3: fix ETS bandwidth validation bug
net: hns3: reallocate SSU' buffer size when pfc_en changes
net: hns3: add compatible handling for MAC VLAN switch parameter configuration
ravb: implement MTU change while device is up
tipc: add back tipc prefix to log messages
ANDROID: uid_sys_stats: avoid double accounting of dying threads
ANDROID: drm: edid: add support for additional CEA extension blocks
ANDROID: include: uapi: drm: add additional drm mode flags
ANDROID: drivers: gpu: drm: fix bugs encountered while fuzzing
ANDROID: driver: gpu: drm: add notifier for panel related events
ANDROID: include: drm: add the definitions for DP Link Compliance tests
ANDROID: drm: dsi: add two DSI mode flags for BLLP
ANDROID: include: drm: support unicasting mipi cmds to dsi ctrls
ANDROID: include: drm: increase DRM max property count to 64
ANDROID: drivers: gpu: drm: add support for secure framebuffer
ANDROID: include: uapi: drm: add additional QCOM modifiers
ANDROID: build kernels with llvm-nm and llvm-objcopy
drm/amdgpu: fix null pointer deref in firmware header printing
rsxx: add missed destroy_workqueue calls in remove
iocost: check active_list of all the ancestors in iocg_activate()
rbd: silence bogus uninitialized warning in rbd_object_map_update_finish()
ceph: increment/decrement dio counter on async requests
ceph: take the inode lock before acquiring cap refs
ALSA: usb-audio: Fix incorrect size check for processing/extension units
KVM: x86/mmu: Take slots_lock when using kvm_mmu_zap_all_fast()
kbuild: tell sparse about the $ARCH
sparc: vdso: fix build error of vdso32
block, bfq: deschedule empty bfq_queues not referred by any process
mmc: sdhci-of-at91: fix quirk2 overwrite
ALSA: usb-audio: Fix incorrect NULL check in create_yamaha_midi_quirk()
slcan: Fix memory leak in error path
io_uring: ensure registered buffer import returns the IO length
net: cdc_ncm: Signedness bug in cdc_ncm_set_dgram_size()
io_uring: Fix getting file for timeout
ANDROID: scsi: ufs-qcom: Enable BROKEN_CRYPTO quirk flag
ANDROID: scsi: ufs-hisi: Enable BROKEN_CRYPTO quirk flag
ANDROID: scsi: ufs: Add quirk bit for controllers that don't play well with inline crypto
ANDROID: scsi: ufs: UFS init should not require inline crypto
gpiolib: acpi: Add Terra Pad 1061 to the run_edge_events_on_boot_blacklist
drm/i915/tgl: MOCS table update
Revert "drm/i915/ehl: Update MOCS table for EHL"
slip: Fix memory leak in slip_open error path
net: usb: qmi_wwan: add support for Foxconn T77W968 LTE modules
KVM: Forbid /dev/kvm being opened by a compat task when CONFIG_KVM_COMPAT=n
KVM: X86: Reset the three MSR list number variables to 0 in kvm_init_msr_list()
selftests: kvm: fix build with glibc >= 2.30
kvm: x86: disable shattered huge page recovery for PREEMPT_RT.
drm/sun4i: tcon: Set min division of TCON0_DCLK to 1.
tools: gpio: Correctly add make dependencies for gpio_utils
x86/resctrl: Fix potential lockdep warning
scripts/tools-support-relr.sh: un-quote variables
Revert "ANDROID: Revert "Merge tag 'modules-for-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux""
ALSA: pcm: Fix stream lock usage in snd_pcm_period_elapsed()
ALSA: usb-audio: not submit urb for stopped endpoint
can: j1939: warn if resources are still linked on destroy
can: j1939: j1939_can_recv(): add priv refcounting
can: j1939: transport: j1939_cancel_active_session(): use hrtimer_try_to_cancel() instead of hrtimer_cancel()
can: j1939: make sure socket is held as long as session exists
can: j1939: transport: make sure the aborted session will be deactivated only once
can: j1939: socket: rework socket locking for j1939_sk_release() and j1939_sk_sendmsg()
can: j1939: main: j1939_ndev_to_priv(): avoid crash if can_ml_priv is NULL
can: j1939: move j1939_priv_put() into sk_destruct callback
can: af_can: export can_sock_destruct()
perf/core: Fix missing static inline on perf_cgroup_switch()
perf/core: Consistently fail fork on allocation failures
perf/aux: Disallow aux_output for kernel events
perf/core: Reattach a misplaced comment
perf/aux: Fix the aux_output group inheritance fix
perf/core: Disallow uncore-cgroup events
sched/pelt: Fix update of blocked PELT ordering
sched/core: Avoid spurious lock dependencies
dpaa2-eth: free already allocated channels on probe defer
Input: cyttsp4_core - fix use after free bug
Input: synaptics-rmi4 - clear IRQ enables for F54
Remove VirtualBox guest shared folders filesystem
net/smc: fix refcount non-blocking connect() -part 2
drm/i915: update rawclk also on resume
x86/quirks: Disable HPET on Intel Coffe Lake platforms
block: check bi_size overflow before merge
gpio: bd70528: Use correct unit for debounce times
gpio: max77620: Fixup debounce delays
KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved
KVM: VMX: Introduce pi_is_pir_empty() helper
KVM: VMX: Do not change PID.NDST when loading a blocked vCPU
KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts
KVM: VMX: Fix comment to specify PID.ON instead of PIR.ON
KVM: X86: Fix initialization of MSR lists
xfrm: release device reference for invalid state
io_uring: make timeout sequence == 0 mean no sequence
ntp/y2038: Remove incorrect time_t truncation
mdio_bus: Fix PTR_ERR applied after initialization to constant
NFC: nxp-nci: Fix NULL pointer dereference after I2C communication error
mlxsw: core: Enable devlink reload only on probe
FROMGIT: of: property: Add device link support for iommus, mboxes and io-channels
FROMGIT: of: property: Make it easy to add device links from DT properties
FROMGIT: of: property: Minor style clean up of of_link_to_phandle()
Revert "ANDROID: of/property: Add device link support for iommus"
devlink: Add method for time-stamp on reporter's dump
net: ethernet: dwmac-sun8i: Use the correct function in exit path
Btrfs: fix log context list corruption after rename exchange operation
drm/i915/cmdparser: Fix jump whitelist clearing
iommu/vt-d: Fix QI_DEV_IOTLB_PFSID and QI_DEV_EIOTLB_PFSID macros
MAINTAINERS: Update for INTEL IOMMU (VT-d) entry
KVM: fix placement of refcount initialization
KVM: Fix NULL-ptr deref after kvm_create_vm fails
ALSA: hda: hdmi - fix pin setup on Tigerlake
ALSA: hda: Add Cometlake-S PCI ID
Revert "Revert "sched: Rework pick_next_task() slow-path""
Linux 5.4-rc7
lib: Remove select of inexistant GENERIC_IO
ecryptfs_lookup_interpose(): lower_dentry->d_parent is not stable either
ecryptfs_lookup_interpose(): lower_dentry->d_inode is not stable
ecryptfs: fix unlink and rmdir in face of underlying fs modifications
audit_get_nd(): don't unlock parent too early
exportfs_decode_fh(): negative pinned may become positive without the parent locked
cgroup: don't put ERR_PTR() into fc->root
tcp: remove redundant new line from tcp_event_sk_skb
devlink: disallow reload operation during device cleanup
ALSA: usb-audio: Fix missing error check at mixer resolution test
ixgbe: need_wakeup flag might not be set for Tx
i40e: need_wakeup flag might not be set for Tx
igb/igc: use ktime accessors for skb->tstamp
i40e: Fix for ethtool -m issue on X722 NIC
iavf: initialize ITRN registers with correct values
ice: fix potential infinite loop because loop counter being too small
qede: fix NULL pointer deref in __qede_remove()
net: fix data-race in neigh_event_send()
sched: Fix pick_next_task() vs 'change' pattern race
sched/core: Fix compilation error when cgroup not selected
cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead
vsock/virtio: fix sock refcnt holding during the shutdown
iwlwifi: pcie: don't consider IV len in A-MSDU
net: ethernet: octeon_mgmt: Account for second possible VLAN header
pwm: bcm-iproc: Prevent unloading the driver module while in use
block: drbd: remove a stray unlock in __drbd_send_protocol()
cpufreq: intel_pstate: Fix invalid EPB setting
mac80211: fix station inactive_time shortly after boot
net/fq_impl: Switch to kvmalloc() for memory allocation
mac80211: fix ieee80211_txq_setup_flows() failure path
drm/i915/gvt: fix dropping obj reference twice
ANDROID: Kconfig.gki: Add SND_SOC_COMPRESS
ipv4: Fix table id reference in fib_sync_down_addr
ipv6: fixes rt6_probe() and fib6_nh->last_probe init
net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
Revert "ANDROID: Kconfig.gki: Add SND_SOC_COMPRESS"
net: usb: qmi_wwan: add support for DW5821e with eSIM support
CDC-NCM: handle incomplete transfer of MTU
nfc: netlink: fix double device reference drop
ANDROID: Kconfig.gki: Add SND_SOC_COMPRESS
ceph: return -EINVAL if given fsc mount option on kernel w/o support
staging: Fix error return code in vboxsf_fill_super()
staging: vboxsf: fix dereference of pointer dentry before it is null checked
staging: vboxsf: Remove unused including <linux/version.h>
x86/speculation/taa: Fix printing of TAA_MSG_SMT on IBRS_ALL CPUs
xfrm: Fix memleak on xfrm state destroy
pinctrl: stmfx: fix valid_mask init sequence
NFC: st21nfca: fix double free
net: hns3: add compatible handling for command HCLGE_OPC_PF_RST_DONE
r8169: fix page read in r8168g_mdio_read
net: stmmac: Fix the TX IOC in xmit path
net: stmmac: Fix TSO descriptor with Enhanced Addressing
net: stmmac: Fix the packet count in stmmac_rx()
net: stmmac: xgmac: Disable MMC interrupts by default
net: stmmac: xgmac: Disable Flow Control when 1 or more queues are in AV
net: stmmac: xgmac: Fix AV Feature detection
net: stmmac: xgmac: Fix TSA selection
net: stmmac: xgmac: Only get SPH header len if available
net: stmmac: selftests: Prevent false positives in filter tests
net: stmmac: xgmac: bitrev32 returns u32
net: stmmac: gmac4: bitrev32 returns u32
SMB3: Fix persistent handles reconnect
drm/radeon: fix si_enable_smc_cac() failed issue
drm/amdgpu/renoir: move gfxoff handling into gfx9 module
drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9
drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10
drm/amdgpu: register gpu instance before fan boost feature enablment
drm/amd/swSMU: fix smu workload bit map error
net/smc: fix ethernet interface refcounting
selftests/tls: add test for concurrent recv and send
net/tls: add a TX lock
net/tls: don't pay attention to sk_write_pending when pushing partial records
blkcg: make blkcg_print_stat() print stats only for online blkgs
drm/shmem: Add docbook comments for drm_gem_shmem_object madvise fields
net: mscc: ocelot: fix __ocelot_rmw_ix prototype
bpf, offload: Unlock on error in bpf_offload_dev_create()
net: mscc: ocelot: fix NULL pointer on LAG slave removal
net: mscc: ocelot: don't handle netdev events for other netdevs
net/mlx5e: Use correct enum to determine uplink port
net/mlx5: DR, Fix memory leak during rule creation
net/mlx5: DR, Fix memory leak in modify action destroy
net/mlx5e: Fix eswitch debug print of max fdb flow
HID: wacom: generic: Treat serial number and related fields as unsigned
drm/amdgpu: add navi14 PCI ID
Revert "drm/amd/display: setting the DIG_MODE to the correct value."
drm/amd/display: Add ENGINE_ID_DIGD condition check for Navi14
drm/amdgpu: dont schedule jobs while in reset
drm/amdgpu/arcturus: properly set BANK_SELECT and FRAGMENT_SIZE
arm64: Do not mask out PTE_RDONLY in pte_same()
ANDROID: nf: IDLETIMER: Fix possible use before initialization in idletimer_resume
ANDROID: gki_defconfig: enable FW_LOADER configs
net: bcmgenet: reapply manual settings to the PHY
Revert "net: bcmgenet: soft reset 40nm EPHYs before MAC init"
net: bcmgenet: use RGMII loopback for MAC reset
drm/atomic: fix self-refresh helpers crtc state dereference
RDMA/hns: Correct the value of srq_desc_size
RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LEN
configfs: calculate the depth of parent item
IB/hfi1: TID RDMA WRITE should not return IB_WC_RNR_RETRY_EXC_ERR
IB/hfi1: Calculate flow weight based on QP MTU for TID RDMA
IB/hfi1: Ensure r_tid_ack is valid before building TID RDMA ACK packet
IB/hfi1: Ensure full Gen3 speed in a Gen4 system
ALSA: timer: Fix incorrectly assigned timer instance
mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges
mm/memory_hotplug: fix updating the node span
scripts/gdb: fix debugging modules compiled with hot/cold partitioning
mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly
MAINTAINERS: update information for "MEMORY MANAGEMENT"
dump_stack: avoid the livelock of the dump_lock
zswap: add Vitaly to the maintainers list
mm/page_alloc.c: ratelimit allocation failure warnings more aggressively
mm/khugepaged: fix might_sleep() warn with CONFIG_HIGHPTE=y
mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo
mm, vmstat: hide /proc/pagetypeinfo from normal users
mm/mmu_notifiers: use the right return code for WARN_ON
ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()
mm: thp: handle page cache THP correctly in PageTransCompoundMap
mm, meminit: recalculate pcpu batch and high limits after init completes
mm/gup_benchmark: fix MAP_HUGETLB case
mm: memcontrol: fix NULL-ptr deref in percpu stats flush
MAINTAINERS: update Cavium ThunderX2 maintainers
ASoC: SOF: topology: Fix bytes control size checks
ARM: dts: stm32: change joystick pinctrl definition on stm32mp157c-ev1
ARM: dts: stm32: remove OV5640 pinctrl definition on stm32mp157c-ev1
ARM: dts: stm32: Fix CAN RAM mapping on stm32mp157c
ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157
scsi: core: Handle drivers which set sg_tablesize to zero
scsi: qla2xxx: fix NPIV tear down process
scsi: sd_zbc: Fix sd_zbc_complete()
Documentation: TLS: Add missing counter description
NFC: fdp: fix incorrect free object
net: prevent load/store tearing on sk->sk_stamp
net: qualcomm: rmnet: Fix potential UAF when unregistering
net/tls: fix sk_msg trim on fallback to copy mode
mlx4_core: fix wrong comment about the reason of subtract one from the max_cqes
net: dsa: bcm_sf2: Fix driver removal
net: sched: prevent duplicate flower rules from tcf_proto destroy race
net: hns3: Use the correct style for SPDX License Identifier
bonding: fix state transition issue in link monitoring
taprio: fix panic while hw offload sched list swap
FROMLIST: overlayfs: override_creds=off option bypass creator_cred
FROMLIST: overlayfs: internal getxattr operations without sepolicy checking
FROMLIST: overlayfs: handle XATTR_NOSECURITY flag for get xattr method
FROMLIST: Add flags option to get xattr method paired to __vfs_getxattr
FROMLIST: afs: xattr: use scnprintf
nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
drm/i915/gen8+: Add RC6 CTX corruption WA
drm/i915: Lower RM timeout to avoid DSI hard hangs
drm/i915/cmdparser: Ignore Length operands during command matching
drm/i915/cmdparser: Add support for backward jumps
drm/i915/cmdparser: Use explicit goto for error paths
drm/i915: Add gen9 BCS cmdparsing
drm/i915: Allow parsing of unsized batches
drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
drm/i915: Add support for mandatory cmdparsing
drm/i915: Remove Master tables from cmdparser
drm/i915: Disable Secure Batches for gen6+
drm/i915: Rename gen7 cmdparser tables
ALSA: hda: hdmi - add Tigerlake support
ASoC: max98373: replace gpio_request with devm_gpio_request
ASoC: stm32: sai: add restriction on mmap support
watchdog: bd70528: Add MODULE_ALIAS to allow module auto loading
watchdog: imx_sc_wdt: Pretimeout should follow SCU firmware format
watchdog: meson: Fix the wrong value of left time
watchdog: pm8916_wdt: fix pretimeout registration flow
watchdog: cpwd: fix build regression
nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
nvme-rdma: fix a segmentation fault during module unload
clone3: validate stack arguments
ceph: don't allow copy_file_range when stripe_count != 1
ceph: don't try to handle hashed dentries in non-O_CREAT atomic_open
ALSA: hda/ca0132 - Fix possible workqueue stall
scripts/nsdeps: make sure to pass all module source files to spatch
perf tools: Fix time sorting
can: don't use deprecated license identifiers
can: mcp251x: mcp251x_restart_work_handler(): Fix potential force_quit race condition
perf tools: Remove unused trace_find_next_event()
perf scripting engines: Iterate on tep event arrays directly
FROMGIT: of: property: Skip adding device links to suppliers that aren't devices
x86/tsc: Respect tsc command line paraemeter for clocksource_tsc_early
Input: synaptics-rmi4 - remove unused result_bits mask
Input: synaptics-rmi4 - do not consume more data than we have (F11, F12)
Input: synaptics-rmi4 - disable the relative position IRQ in the F12 driver
Input: synaptics-rmi4 - fix video buffer size
x86/dumpstack/64: Don't evaluate exception stacks before setup
irq/irqdomain: Update __irq_domain_alloc_fwnode() function documentation
x86/apic/32: Avoid bogus LDR warnings
timekeeping/vsyscall: Update VDSO data unconditionally
drm/i915/dp: Do not switch aux to TBT mode for non-TC ports
drm/i915: Avoid HPD poll detect triggering a new detect cycle
can: j1939: transport: j1939_xtp_rx_eoma_one(): Add sanity check for correct total message size
can: j1939: transport: j1939_session_fresh_new(): make sure EOMA is send with the total message size set
can: j1939: fix memory leak if filters was set
can: j1939: fix resource leak of skb on error return paths
can: ti_hecc: add missing state changes
can: ti_hecc: properly report state changes
can: ti_hecc: add fifo overflow error reporting
can: ti_hecc: release the mailbox a bit earlier
can: ti_hecc: keep MIM and MD set
can: ti_hecc: ti_hecc_stop(): stop the CPK on down
can: ti_hecc: ti_hecc_error(): increase error counters if skb enqueueing via can_rx_offload_queue_sorted() fails
can: flexcan: increase error counters if skb enqueueing via can_rx_offload_queue_sorted() fails
can: rx-offload: can_rx_offload_irq_offload_fifo(): continue on error
can: rx-offload: can_rx_offload_irq_offload_timestamp(): continue on error
can: rx-offload: can_rx_offload_offload_one(): use ERR_PTR() to propagate error value in case of errors
can: rx-offload: can_rx_offload_offload_one(): increment rx_fifo_errors on queue overflow or OOM
can: rx-offload: can_rx_offload_offload_one(): do not increase the skb_queue beyond skb_queue_len_max
can: rx-offload: can_rx_offload_queue_tail(): fix error handling, avoid skb mem leak
can: rx-offload: can_rx_offload_queue_sorted(): fix error handling, avoid skb mem leak
can: xilinx_can: Fix flags field initialization for axi can
can: c_can: C_CAN: add bus recovery events
can: c_can: D_CAN: c_can_chip_config(): perform a sofware reset on open
can: c_can: c_can_poll(): only read status register after status IRQ
can: peak_usb: report bus recovery as well
can: peak_usb: fix slab info leak
can: peak_usb: fix a potential out-of-sync while decoding packets
can: flexcan: disable completely the ECC mechanism
can: usb_8dev: fix use-after-free on disconnect
can: mcba_usb: fix use-after-free on disconnect
can: gs_usb: gs_can_open(): prevent memory leak
can: dev: add missing of_node_put() after calling of_get_child_by_name()
btrfs: un-deprecate ioctls START_SYNC and WAIT_SYNC
btrfs: save i_size to avoid double evaluation of i_size_read in compress_file_range
stacktrace: Don't skip first entry on noncurrent tasks
netfilter: nf_tables_offload: skip EBUSY on chain update
netfilter: nf_tables: bogus EOPNOTSUPP on basechain update
bridge: ebtables: don't crash when using dnat target in output chains
netfilter: nf_tables: fix unexpected EOPNOTSUPP error
netfilter: nf_tables: Align nft_expr private data to 64-bit
netfilter: ipset: Fix nla_policies to fully support NL_VALIDATE_STRICT
netfilter: ipset: Copy the right MAC address in hash:ip,mac IPv6 sets
netfilter: ipset: Fix an error code in ip_set_sockfn_get()
dccp: do not leak jiffies on the wire
net: fec: add missed clk_disable_unprepare in remove
Documentation: Add ITLB_MULTIHIT documentation
kvm: x86: mmu: Recovery of shattered NX large pages
MAINTAINERS: Remove Kevin as maintainer of BMIPS generic platforms
clk: ti: clkctrl: Fix failed to enable error with double udelay timeout
clk: ti: dra7-atl-clock: Remove ti_clk_add_alias call
netfilter: nf_tables_offload: check for register data length mismatches
intel_th: pci: Add Jasper Lake PCH support
intel_th: pci: Add Comet Lake PCH support
intel_th: msu: Fix possible memory leak in mode_store()
intel_th: msu: Fix overflow in shift of an unsigned int
intel_th: msu: Fix missing allocation failure check on a kstrndup
intel_th: msu: Fix an uninitialized mutex
intel_th: gth: Fix the window switching sequence
ASoC: hdac_hda: fix race in device removal
kvm: Add helper function for creating VM worker threads
kvm: mmu: ITLB_MULTIHIT mitigation
cpu/speculation: Uninline and export CPU mitigations helpers
x86/cpu: Add Tremont to the cpu vulnerability whitelist
x86/bugs: Add ITLB_MULTIHIT bug infrastructure
fbdev: c2p: Fix link failure on non-inlining
ALSA: bebob: fix to detect configured source of sampling clock for Focusrite Saffire Pro i/o series
arm64: dts: zii-ultra: fix ARM regulator GPIO handle
Revert "gpio: merrifield: Pass irqchip when adding gpiochip"
Revert "gpio: merrifield: Restore use of irq_base"
Revert "gpio: merrifield: Move hardware initialization to callback"
x86/resctrl: Prevent NULL pointer dereference when reading mondata
idr: Fix idr_alloc_u32 on 32-bit systems
idr: Fix integer overflow in idr_for_each_entry
HID: i2c-hid: Send power-on command after reset
radix tree: Remove radix_tree_iter_find
idr: Fix idr_get_next_ul race with idr_remove
powerpc/bpf: Fix tail call implementation
MIPS: SGI-IP27: fix exception handler replication
bpf: Change size to u64 for bpf_map_{area_alloc, charge_init}()
samples/bpf: fix build by setting HAVE_ATTR_TEST to zero
perf tools: Make usage of test_attr__* optional for perf-sys.h
bpf: Allow narrow loads of bpf_sysctl fields with offset > 0
staging: Add VirtualBox guest shared folder (vboxsf) support
ceph: add missing check in d_revalidate snapdir handling
ceph: fix RCU case handling in ceph_d_revalidate()
ceph: fix use-after-free in __ceph_remove_cap()
bpf, doc: Add Andrii as official reviewer to BPF subsystem
ARM: sunxi: Fix CPU powerdown on A83T
ARM: dts: sun8i-a83t-tbs-a711: Fix WiFi resume from suspend
clk: sunxi-ng: a80: fix the zero'ing of bits 16 and 18
clk: sunxi: Fix operator precedence in sunxi_divs_clk_setup
clk: ast2600: Fix enabling of clocks
clk: at91: avoid sleeping early
ASoC: rockchip: rockchip_max98090: Enable SHDN to fix headset detection
ASoC: ti: sdma-pcm: Add back the flags parameter for non standard dma names
ASoC: SOF: ipc: Fix memory leak in sof_set_get_large_ctrl_data
ASoC: SOF: Fix memory leak in sof_dfsentry_write
ASoC: SOF: Intel: hda-stream: fix the CONFIG_ prefix missing
arm64: dts: imx8mn: fix compatible string for sdma
arm64: dts: imx8mm: fix compatible string for sdma
reset: fix reset_control_ops kerneldoc comment
clk: imx8m: Use SYS_PLL1_800M as intermediate parent of CLK_ARM
x86/tsx: Add config options to set tsx=on|off|auto
x86/speculation/taa: Add documentation for TSX Async Abort
x86/tsx: Add "auto" option to the tsx= cmdline parameter
kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
x86/speculation/taa: Add sysfs reporting for TSX Async Abort
x86/speculation/taa: Add mitigation for TSX Async Abort
x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
x86/cpu: Add a helper function x86_read_arch_cap_msr()
x86/msr: Add the IA32_TSX_CTRL MSR
ARM: dts: imx6-logicpd: Re-enable SNVS power key
iio: adc: stm32-adc: fix stopping dma
soc: imx: gpc: fix initialiser format
ARM: dts: imx6qdl-sabreauto: Fix storm of accelerometer interrupts
Btrfs: fix race leading to metadata space leak after task received signal
btrfs: tree-checker: Fix wrong check on max devid
btrfs: Consider system chunk array size for new SYSTEM chunks
pinctrl: cherryview: Allocate IRQ chip dynamic
clk: samsung: exynos5420: Preserve PLL configuration during suspend/resume
arm64: dts: ls1028a: fix a compatible issue
autofs: fix a leak in autofs_expire_indirect()
soundwire: slave: fix scanf format
reset: fix reset_control_get_exclusive kerneldoc comment
reset: fix reset_control_lookup kerneldoc comment
reset: fix of_reset_control_get_count kerneldoc comment
reset: fix of_reset_simple_xlate kerneldoc comment
ASoC: kirkwood: fix device remove ordering
ASoC: rsnd: dma: fix SSI9 4/5/6/7 busif dma address
ASoC: hdmi-codec: drop mutex locking again
ASoC: kirkwood: fix external clock probe defer
clk: samsung: exynos542x: Move G3D subsystem clocks to its sub-CMU
clk: samsung: exynos5433: Fix error paths
tools: gpio: Use !building_out_of_srctree to determine srctree
iio: imu: inv_mpu6050: fix no data on MPU6050
reset: Fix memory leak in reset_control_array_put()
aio: Fix io_pgetevents() struct __compat_aio_sigset layout
pinctrl: cherryview: Fix irq_valid_mask calculation
ASoC: compress: fix unsigned integer overflow check
ASoC: msm8916-wcd-analog: Fix RX1 selection in RDAC2 MUX
pinctrl: intel: Avoid potential glitches if pin is in GPIO mode
soundwire: intel: fix intel_register_dai PDI offsets and numbers
interconnect: Add locking in icc_set_tag()
interconnect: qcom: Fix icc_onecell_data allocation
clocksource/drivers/sh_mtu2: Do not loop using platform_get_irq_by_name()
fs/namespace.c: fix use-after-free of mount in mnt_warn_timestamp_expiry()
clocksource/drivers/mediatek: Fix error handling
soundwire: depend on ACPI || OF
soundwire: depend on ACPI
iio: srf04: fix wrong limitation in distance measuring
iio: imu: adis16480: make sure provided frequency is positive
cgroup: freezer: call cgroup_enter_frozen() with preemption disabled in ptrace_stop()
thunderbolt: Drop unnecessary read when writing LC command in Ice Lake
thunderbolt: Fix lockdep circular locking depedency warning
thunderbolt: Read DP IN adapter first two dwords in one go
clk: at91: sam9x60: fix programmable clock
clk: meson: g12a: set CLK_MUX_ROUND_CLOSEST on the cpu clock muxes
clk: meson: g12a: fix cpu clock rate setting
clk: meson: gxbb: let sar_adc_clk_div set the parent clock rate
XArray: Fix xas_next() with a single entry at 0
Change-Id: I594a629aca56b3ff5a224d65f0d8d79fe4f4f34b
Signed-off-by: Raghavendra Rao Ananta <rananta@codeaurora.org>
|
||
|
|
5df373e956 |
mm/page_io.c: do not free shared swap slots
The following race is observed due to which a processes faulting on a
swap entry, finds the page neither in swapcache nor swap. This causes
zram to give a zero filled page that gets mapped to the process,
resulting in a user space crash later.
Consider parent and child processes Pa and Pb sharing the same swap slot
with swap_count 2. Swap is on zram with SWP_SYNCHRONOUS_IO set.
Virtual address 'VA' of Pa and Pb points to the shared swap entry.
Pa Pb
fault on VA fault on VA
do_swap_page do_swap_page
lookup_swap_cache fails lookup_swap_cache fails
Pb scheduled out
swapin_readahead (deletes zram entry)
swap_free (makes swap_count 1)
Pb scheduled in
swap_readpage (swap_count == 1)
Takes SWP_SYNCHRONOUS_IO path
zram enrty absent
zram gives a zero filled page
Fix this by making sure that swap slot is freed only when swap count
drops down to one.
Link: http://lkml.kernel.org/r/1571743294-14285-1-git-send-email-vinmenon@codeaurora.org
Fixes:
|
||
|
|
b996ea53a5 |
mm: ratelimit swap write errors
Ratelimit the swap write errors to avoid logbuf getting filled up by these messages. These pages remove the useful page allocation failure messages (when zram is used as swap) from logbuf, thus making the debugging difficult. Change-Id: I9d4229a76b2551d7baca603112d53012d8722415 Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org> |
||
|
|
4efaceb1c5 |
mm, swap: use rbtree for swap_extent
swap_extent is used to map swap page offset to backing device's block offset. For a continuous block range, one swap_extent is used and all these swap_extents are managed in a linked list. These swap_extents are used by map_swap_entry() during swap's read and write path. To find out the backing device's block offset for a page offset, the swap_extent list will be traversed linearly, with curr_swap_extent being used as a cache to speed up the search. This works well as long as swap_extents are not huge or when the number of processes that access swap device are few, but when the swap device has many extents and there are a number of processes accessing the swap device concurrently, it can be a problem. On one of our servers, the disk's remaining size is tight: $df -h Filesystem Size Used Avail Use% Mounted on ... ... /dev/nvme0n1p1 1.8T 1.3T 504G 72% /home/t4 When creating a 80G swapfile there, there are as many as 84656 swap extents. The end result is, kernel spends abou 30% time in map_swap_entry() and swap throughput is only 70MB/s. As a comparison, when I used smaller sized swapfile, like 4G whose swap_extent dropped to 2000, swap throughput is back to 400-500MB/s and map_swap_entry() is about 3%. One downside of using rbtree for swap_extent is, 'struct rbtree' takes 24 bytes while 'struct list_head' takes 16 bytes, that's 8 bytes more for each swap_extent. For a swapfile that has 80k swap_extents, that means 625KiB more memory consumed. Test: Since it's not possible to reboot that server, I can not test this patch diretly there. Instead, I tested it on another server with NVMe disk. I created a 20G swapfile on an NVMe backed XFS fs. By default, the filesystem is quite clean and the created swapfile has only 2 extents. Testing vanilla and this patch shows no obvious performance difference when swapfile is not fragmented. To see the patch's effects, I used some tweaks to manually fragment the swapfile by breaking the extent at 1M boundary. This made the swapfile have 20K extents. nr_task=4 kernel swapout(KB/s) map_swap_entry(perf) swapin(KB/s) map_swap_entry(perf) vanilla 165191 90.77% 171798 90.21% patched 858993 +420% 2.16% 715827 +317% 0.77% nr_task=8 kernel swapout(KB/s) map_swap_entry(perf) swapin(KB/s) map_swap_entry(perf) vanilla 306783 92.19% 318145 87.76% patched 954437 +211% 2.35% 1073741 +237% 1.57% swapout: the throughput of swap out, in KB/s, higher is better 1st map_swap_entry: cpu cycles percent sampled by perf swapin: the throughput of swap in, in KB/s, higher is better. 2nd map_swap_entry: cpu cycles percent sampled by perf nr_task=1 doesn't show any difference, this is due to the curr_swap_extent can be effectively used to cache the correct swap extent for single task workload. [akpm@linux-foundation.org: s/BUG_ON(1)/BUG()/] Link: http://lkml.kernel.org/r/20190523142404.GA181@aaronlu Signed-off-by: Aaron Lu <ziqian.lzq@antfin.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
8751853091 |
swap_readpage(): avoid blk_wake_io_task() if !synchronous
swap_readpage() sets waiter = bio->bi_private even if synchronous = F,
this means that the caller can get the spurious wakeup after return.
This can be fatal if blk_wake_io_task() does
set_current_state(TASK_RUNNING) after the caller does
set_special_state(), in the worst case the kernel can crash in
do_task_dead().
Link: http://lkml.kernel.org/r/20190704160301.GA5956@redhat.com
Fixes:
|
||
|
|
1a5f439c7c |
mm, swap: fix THP swap out
0-Day test system reported some OOM regressions for several THP (Transparent Huge Page) swap test cases. These regressions are bisected to |
||
|
|
b685a7350a |
mm/page_io.c: fix polled swap page in
swap_readpage() wants to do polling to bring in pages if asked to, but it doesn't mark the bio as being polled. Additionally, the looping around the blk_poll() check isn't correct - if we get a zero return, we should call io_schedule(), we can't just assume that the bio has completed. The regular bio->bi_private check should be used for that. Link: http://lkml.kernel.org/r/e15243a8-2cdf-c32c-ecee-f289377c8ef9@kernel.dk Signed-off-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
1ac5cd4978 |
block: don't use un-ordered __set_current_state(TASK_UNINTERRUPTIBLE)
This mostly reverts commit
|
||
|
|
6a7f6d86a5 |
blkcg: associate a blkg for pages being evicted by swap
A prior patch in this series added blkg association to bios issued by cgroups. There are two other paths that we want to attribute work back to the appropriate cgroup: swap and writeback. Here we modify the way swap tags bios to include the blkg. Writeback will be tackle in the next patch. Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
0a1b8b87d0 |
block: make blk_poll() take a parameter on whether to spin or not
blk_poll() has always kept spinning until it found an IO. This is fine for SYNC polling, since we need to find one request we have pending, but in preparation for ASYNC polling it can be beneficial to just check if we have any entries available or not. Existing callers are converted to pass in 'spin == true', to retain the old behavior. Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
849a370016 |
block: avoid ordered task state change for polled IO
For the core poll helper, the task state setting don't need to imply any atomics, as it's the current task itself that is being modified and we're not going to sleep. For IRQ driven, the wakeup path have the necessary barriers to not need us using the heavy handed version of the task state setting. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
0619317ff8 |
block: add polled wakeup task helper
If we're polling for IO on a device that doesn't use interrupts, then IO completion loop (and wake of task) is done by submitting task itself. If that is the case, then we don't need to enter the wake_up_process() function, we can simply mark ourselves as TASK_RUNNING. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
5f21585384 |
Merge tag 'for-linus-20181102' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"The biggest part of this pull request is the revert of the blkcg
cleanup series. It had one fix earlier for a stacked device issue, but
another one was reported. Rather than play whack-a-mole with this,
revert the entire series and try again for the next kernel release.
Apart from that, only small fixes/changes.
Summary:
- Indentation fixup for mtip32xx (Colin Ian King)
- The blkcg cleanup series revert (Dennis Zhou)
- Two NVMe fixes. One fixing a regression in the nvme request
initialization in this merge window, causing nvme-fc to not work.
The other is a suspend/resume p2p resource issue (James, Keith)
- Fix sg discard merge, allowing us to merge in cases where we didn't
before (Jianchao Wang)
- Call rq_qos_exit() after the queue is frozen, preventing a hang
(Ming)
- Fix brd queue setup, fixing an oops if we fail setting up all
devices (Ming)"
* tag 'for-linus-20181102' of git://git.kernel.dk/linux-block:
nvme-pci: fix conflicting p2p resource adds
nvme-fc: fix request private initialization
blkcg: revert blkcg cleanups series
block: brd: associate with queue until adding disk
block: call rq_qos_exit() after queue is frozen
mtip32xx: clean an indentation issue, remove extraneous tabs
block: fix the DISCARD request merge
|
||
|
|
9931a07d51 |
Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull AFS updates from Al Viro: "AFS series, with some iov_iter bits included" * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) missing bits of "iov_iter: Separate type from direction and use accessor functions" afs: Probe multiple fileservers simultaneously afs: Fix callback handling afs: Eliminate the address pointer from the address list cursor afs: Allow dumping of server cursor on operation failure afs: Implement YFS support in the fs client afs: Expand data structure fields to support YFS afs: Get the target vnode in afs_rmdir() and get a callback on it afs: Calc callback expiry in op reply delivery afs: Fix FS.FetchStatus delivery from updating wrong vnode afs: Implement the YFS cache manager service afs: Remove callback details from afs_callback_break struct afs: Commit the status on a new file/dir/symlink afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS afs: Don't invoke the server to read data beyond EOF afs: Add a couple of tracepoints to log I/O errors afs: Handle EIO from delivery function afs: Fix TTL on VL server and address lists afs: Implement VL server rotation afs: Improve FS server rotation error handling ... |
||
|
|
b5f2954d30 |
blkcg: revert blkcg cleanups series
This reverts a series committed earlier due to null pointer exception bug report in [1]. It seems there are edge case interactions that I did not consider and will need some time to understand what causes the adverse interactions. The original series can be found in [2] with a follow up series in [3]. [1] https://www.spinics.net/lists/cgroups/msg20719.html [2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/ [3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/ This reverts the following commits: |
||
|
|
bc4ae27d81 |
mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS
The SWP_FILE flag serves two purposes: to make swap_{read,write}page() go
through the filesystem, and to make swapoff() call ->swap_deactivate().
For Btrfs, we want the latter but not the former, so split this flag into
two. This makes us always call ->swap_deactivate() if ->swap_activate()
succeeded, not just if it didn't add any swap extents itself.
This also resolves the issue of the very misleading name of SWP_FILE,
which is only used for swap files over NFS.
Link: http://lkml.kernel.org/r/6d63d8668c4287a4f6d203d65696e96f80abdfc7.1536704650.git.osandov@fb.com
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Sterba <dsterba@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
aa563d7bca |
iov_iter: Separate type from direction and use accessor functions
In the iov_iter struct, separate the iterator type from the iterator direction and use accessor functions to access them in most places. Convert a bunch of places to use switch-statements to access them rather then chains of bitwise-AND statements. This makes it easier to add further iterator types. Also, this can be more efficient as to implement a switch of small contiguous integers, the compiler can use ~50% fewer compare instructions than it has to use bitwise-and instructions. Further, cease passing the iterator type into the iterator setup function. The iterator function can set that itself. Only the direction is required. Signed-off-by: David Howells <dhowells@redhat.com> |
||
|
|
74b7c02a9b |
blkcg: associate a blkg for pages being evicted by swap
A prior patch in this series added blkg association to bios issued by cgroups. There are two other paths that we want to attribute work back to the appropriate cgroup: swap and writeback. Here we modify the way swap tags bios to include the blkg. Writeback will be tackle in the next patch. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
0d3bd88d54 |
swap,blkcg: issue swap io with the appropriate context
For backcharging we need to know who the page belongs to when swapping it out. We don't worry about things that do ->rw_page (zram etc) at the moment, we're only worried about pages that actually go to a block device. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Josef Bacik <jbacik@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
0d1e0c7cd5 |
blk: introduce REQ_SWAP
Just like REQ_META, it's important to know the IO coming down is swap in order to guard against potential IO priority inversion issues with cgroups. Add REQ_SWAP and use it for all swap IO, and add it to our bio_issue_as_root_blkg helper. Signed-off-by: Josef Bacik <jbacik@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
263663cd3c |
block: convert to bio_first_bvec_all & bio_first_page_all
This patch converts to bio_first_bvec_all() & bio_first_page_all() for retrieving the 1st bvec/page, and prepares for supporting multipage bvec. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
0bcac06f27 |
mm, swap: skip swapcache for swapin of synchronous device
With fast swap storage, the platforms want to use swap more aggressively and swap-in is crucial to application latency. The rw_page() based synchronous devices like zram, pmem and btt are such fast storage. When I profile swapin performance with zram lz4 decompress test, S/W overhead is more than 70%. Maybe, it would be bigger in nvdimm. This patch aims to reduce swap-in latency by skipping swapcache if the swap device is synchronous device like rw_page based device. It enhances 45% my swapin test(5G sequential swapin, no readahead, from 2.41sec to 1.64sec). Link: http://lkml.kernel.org/r/1505886205-9671-5-git-send-email-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
e2c5923c34 |
Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block
Pull core block layer updates from Jens Axboe:
"This is the main pull request for block storage for 4.15-rc1.
Nothing out of the ordinary in here, and no API changes or anything
like that. Just various new features for drivers, core changes, etc.
In particular, this pull request contains:
- A patch series from Bart, closing the whole on blk/scsi-mq queue
quescing.
- A series from Christoph, building towards hidden gendisks (for
multipath) and ability to move bio chains around.
- NVMe
- Support for native multipath for NVMe (Christoph).
- Userspace notifications for AENs (Keith).
- Command side-effects support (Keith).
- SGL support (Chaitanya Kulkarni)
- FC fixes and improvements (James Smart)
- Lots of fixes and tweaks (Various)
- bcache
- New maintainer (Michael Lyle)
- Writeback control improvements (Michael)
- Various fixes (Coly, Elena, Eric, Liang, et al)
- lightnvm updates, mostly centered around the pblk interface
(Javier, Hans, and Rakesh).
- Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)
- Writeback series that fix the much discussed hundreds of millions
of sync-all units. This goes all the way, as discussed previously
(me).
- Fix for missing wakeup on writeback timer adjustments (Yafang
Shao).
- Fix laptop mode on blk-mq (me).
- {mq,name} tupple lookup for IO schedulers, allowing us to have
alias names. This means you can use 'deadline' on both !mq and on
mq (where it's called mq-deadline). (me).
- blktrace race fix, oopsing on sg load (me).
- blk-mq optimizations (me).
- Obscure waitqueue race fix for kyber (Omar).
- NBD fixes (Josef).
- Disable writeback throttling by default on bfq, like we do on cfq
(Luca Miccio).
- Series from Ming that enable us to treat flush requests on blk-mq
like any other request. This is a really nice cleanup.
- Series from Ming that improves merging on blk-mq with schedulers,
getting us closer to flipping the switch on scsi-mq again.
- BFQ updates (Paolo).
- blk-mq atomic flags memory ordering fixes (Peter Z).
- Loop cgroup support (Shaohua).
- Lots of minor fixes from lots of different folks, both for core and
driver code"
* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
nvme: fix visibility of "uuid" ns attribute
blk-mq: fixup some comment typos and lengths
ide: ide-atapi: fix compile error with defining macro DEBUG
blk-mq: improve tag waiting setup for non-shared tags
brd: remove unused brd_mutex
blk-mq: only run the hardware queue if IO is pending
block: avoid null pointer dereference on null disk
fs: guard_bio_eod() needs to consider partitions
xtensa/simdisk: fix compile error
nvme: expose subsys attribute to sysfs
nvme: create 'slaves' and 'holders' entries for hidden controllers
block: create 'slaves' and 'holders' entries for hidden gendisks
nvme: also expose the namespace identification sysfs files for mpath nodes
nvme: implement multipath access to nvme subsystems
nvme: track shared namespaces
nvme: introduce a nvme_ns_ids structure
nvme: track subsystems
block, nvme: Introduce blk_mq_req_flags_t
block, scsi: Make SCSI quiesce and resume work reliably
block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
...
|
||
|
|
ea435e1b93 |
block: add a poll_fn callback to struct request_queue
That we we can also poll non blk-mq queues. Mostly needed for the NVMe multipath code, but could also be useful elsewhere. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
b24413180f |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
|
a0725ab0c7 |
Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
"This is the first pull request for 4.14, containing most of the code
changes. It's a quiet series this round, which I think we needed after
the churn of the last few series. This contains:
- Fix for a registration race in loop, from Anton Volkov.
- Overflow complaint fix from Arnd for DAC960.
- Series of drbd changes from the usual suspects.
- Conversion of the stec/skd driver to blk-mq. From Bart.
- A few BFQ improvements/fixes from Paolo.
- CFQ improvement from Ritesh, allowing idling for group idle.
- A few fixes found by Dan's smatch, courtesy of Dan.
- A warning fixup for a race between changing the IO scheduler and
device remova. From David Jeffery.
- A few nbd fixes from Josef.
- Support for cgroup info in blktrace, from Shaohua.
- Also from Shaohua, new features in the null_blk driver to allow it
to actually hold data, among other things.
- Various corner cases and error handling fixes from Weiping Zhang.
- Improvements to the IO stats tracking for blk-mq from me. Can
drastically improve performance for fast devices and/or big
machines.
- Series from Christoph removing bi_bdev as being needed for IO
submission, in preparation for nvme multipathing code.
- Series from Bart, including various cleanups and fixes for switch
fall through case complaints"
* 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits)
kernfs: checking for IS_ERR() instead of NULL
drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set
drbd: Fix allyesconfig build, fix recent commit
drbd: switch from kmalloc() to kmalloc_array()
drbd: abort drbd_start_resync if there is no connection
drbd: move global variables to drbd namespace and make some static
drbd: rename "usermode_helper" to "drbd_usermode_helper"
drbd: fix race between handshake and admin disconnect/down
drbd: fix potential deadlock when trying to detach during handshake
drbd: A single dot should be put into a sequence.
drbd: fix rmmod cleanup, remove _all_ debugfs entries
drbd: Use setup_timer() instead of init_timer() to simplify the code.
drbd: fix potential get_ldev/put_ldev refcount imbalance during attach
drbd: new disk-option disable-write-same
drbd: Fix resource role for newly created resources in events2
drbd: mark symbols static where possible
drbd: Send P_NEG_ACK upon write error in protocol != C
drbd: add explicit plugging when submitting batches
drbd: change list_for_each_safe to while(list_first_entry_or_null)
drbd: introduce drbd_recv_header_maybe_unplug
...
|
||
|
|
225311a464 |
mm: test code to write THP to swap device as a whole
To support delay splitting THP (Transparent Huge Page) after swapped out, we need to enhance swap writing code to support to write a THP as a whole. This will improve swap write IO performance. As Ming Lei <ming.lei@redhat.com> pointed out, this should be based on multipage bvec support, which hasn't been merged yet. So this patch is only for testing the functionality of the other patches in the series. And will be reimplemented after multipage bvec support is merged. Link: http://lkml.kernel.org/r/20170724051840.2309-7-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Ross Zwisler <ross.zwisler@intel.com> [for brd.c, zram_drv.c, pmem.c] Cc: Shaohua Li <shli@kernel.org> Cc: Vishal L Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
74d46992e0 |
block: replace bi_bdev with a gendisk pointer and partitions index
This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
b0ba2d0faf |
mm/page_io.c: fix oops during block io poll in swapin path
When a thread is OOM-killed during swap_readpage() operation, an oops
occurs because end_swap_bio_read() is calling wake_up_process() based on
an assumption that the thread which called swap_readpage() is still
alive.
Out of memory: Kill process 525 (polkitd) score 0 or sacrifice child
Killed process 525 (polkitd) total-vm:528128kB, anon-rss:0kB, file-rss:4kB, shmem-rss:0kB
oom_reaper: reaped process 525 (polkitd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter coretemp ppdev pcspkr vmw_balloon sg shpchp vmw_vmci parport_pc parport i2c_piix4 ip_tables xfs libcrc32c sd_mod sr_mod cdrom ata_generic pata_acpi vmwgfx ahci libahci drm_kms_helper ata_piix syscopyarea sysfillrect sysimgblt fb_sys_fops mptspi scsi_transport_spi ttm e1000 mptscsih drm mptbase i2c_core libata serio_raw
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0-rc2-next-20170725 #129
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
task: ffffffffb7c16500 task.stack: ffffffffb7c00000
RIP: 0010:__lock_acquire+0x151/0x12f0
Call Trace:
<IRQ>
lock_acquire+0x59/0x80
_raw_spin_lock_irqsave+0x3b/0x4f
try_to_wake_up+0x3b/0x410
wake_up_process+0x10/0x20
end_swap_bio_read+0x6f/0xf0
bio_endio+0x92/0xb0
blk_update_request+0x88/0x270
scsi_end_request+0x32/0x1c0
scsi_io_completion+0x209/0x680
scsi_finish_command+0xd4/0x120
scsi_softirq_done+0x120/0x140
__blk_mq_complete_request_remote+0xe/0x10
flush_smp_call_function_queue+0x51/0x120
generic_smp_call_function_single_interrupt+0xe/0x20
smp_trace_call_function_single_interrupt+0x22/0x30
smp_call_function_single_interrupt+0x9/0x10
call_function_single_interrupt+0xa7/0xb0
</IRQ>
RIP: 0010:native_safe_halt+0x6/0x10
default_idle+0xe/0x20
arch_cpu_idle+0xa/0x10
default_idle_call+0x1e/0x30
do_idle+0x187/0x200
cpu_startup_entry+0x6e/0x70
rest_init+0xd0/0xe0
start_kernel+0x456/0x477
x86_64_start_reservations+0x24/0x26
x86_64_start_kernel+0xf7/0x11a
secondary_startup_64+0xa5/0xa5
Code: c3 49 81 3f 20 9e 0b b8 41 bc 00 00 00 00 44 0f 45 e2 83 fe 01 0f 87 62 ff ff ff 89 f0 49 8b 44 c7 08 48 85 c0 0f 84 52 ff ff ff <f0> ff 80 98 01 00 00 8b 3d 5a 49 c4 01 45 8b b3 18 0c 00 00 85
RIP: __lock_acquire+0x151/0x12f0 RSP: ffffa01f39e03c50
---[ end trace 6c441db499169b1e ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x36000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt
Fix it by holding a reference to the thread.
[akpm@linux-foundation.org: add comment]
Fixes:
|
||
|
|
23955622ff |
swap: add block io poll in swapin path
For fast flash disk, async IO could introduce overhead because of context switch. block-mq now supports IO poll, which improves performance and latency a lot. swapin is a good place to use this technique, because the task is waiting for the swapin page to continue execution. In my virtual machine, directly read 4k data from a NVMe with iopoll is about 60% better than that without poll. With iopoll support in swapin patch, my microbenchmark (a task does random memory write) is about 10%~25% faster. CPU utilization increases a lot though, 2x and even 3x CPU utilization. This will depend on disk speed. While iopoll in swapin isn't intended for all usage cases, it's a win for latency sensistive workloads with high speed swap disk. block layer has knob to control poll in runtime. If poll isn't enabled in block layer, there should be no noticeable change in swapin. I got a chance to run the same test in a NVMe with DRAM as the media. In simple fio IO test, blkpoll boosts 50% performance in single thread test and ~20% in 8 threads test. So this is the base line. In above swap test, blkpoll boosts ~27% performance in single thread test. blkpoll uses 2x CPU time though. If we enable hybid polling, the performance gain has very slight drop but CPU time is only 50% worse than that without blkpoll. Also we can adjust parameter of hybid poll, with it, the CPU time penality is reduced further. In 8 threads test, blkpoll doesn't help though. The performance is similar to that without blkpoll, but cpu utilization is similar too. There is lock contention in swap path. The cpu time spending on blkpoll isn't high. So overall, blkpoll swapin isn't worse than that without it. The swapin readahead might read several pages in in the same time and form a big IO request. Since the IO will take longer time, it doesn't make sense to do poll, so the patch only does iopoll for single page swapin. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/070c3c3e40b711e7b1390002c991e86a-b5408f0@7511894063d3764ff01ea8111f5a004d7dd700ed078797c204a24e620ddb965c Signed-off-by: Shaohua Li <shli@fb.com> Cc: Tim Chen <tim.c.chen@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jens Axboe <axboe@fb.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
4e4cbee93d |
block: switch bios to blk_status_t
Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep arround at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com> |
||
|
|
7637241e65 |
writeback: add wbc_to_write_flags()
Add wbc_to_write_flags(), which returns the write modifier flags to use, based on a struct writeback_control. No functional changes in this patch, but it prepares us for factoring other wbc fields for write type. Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> |
||
|
|
cc30c5d646 |
mm/page_io.c: replace some BUG_ON()s with VM_BUG_ON_PAGE()
So they are CONFIG_DEBUG_VM-only and more informative. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David S. Miller <davem@davemloft.net> Cc: Hugh Dickins <hughd@google.com> Cc: Jens Axboe <axboe@fb.com> Cc: Joe Perches <joe@perches.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rik van Riel <riel@redhat.com> Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
c8de641b1e |
mm: fix the page_swap_info() BUG_ON check
Commit
|
||
|
|
ba13e83ec3 |
mm: make __swap_writepage() use bio_set_op_attrs()
Cleaner than manipulating bio->bi_rw flags directly. Signed-off-by: Jens Axboe <axboe@fb.com> |
||
|
|
7e4411bfe6 |
mm: add cond_resched() to generic_swapfile_activate()
generic_swapfile_activate() can take quite long time, it iterates over all blocks of a file, so add cond_resched to it. I observed about 1 second stalls when activating a swapfile that was almost unfragmented - this patch fixes it. Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1607221710580.4818@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
95fe6c1a20 |
block, fs, mm, drivers: use bio set/get op accessors
This patch converts the simple bi_rw use cases in the block, drivers, mm and fs code to set/get the bio operation using bio_set_op_attrs/bio_op These should be simple one or two liner cases, so I just did them in one patch. The next patches handle the more complicated cases in a module per patch. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com> |
||
|
|
4e49ea4a3d |
block/fs/drivers: remove rw argument from submit_bio
This has callers of submit_bio/submit_bio_wait set the bio->bi_rw instead of passing it in. This makes that use the same as generic_make_request and how we set the other bio fields. Signed-off-by: Mike Christie <mchristi@redhat.com> Fixed up fs/ext4/crypto.c Signed-off-by: Jens Axboe <axboe@fb.com> |
||
|
|
c2e7b20705 |
Merge branch 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs cleanups from Al Viro: "More cleanups from Christoph" * 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: nfsd: use RWF_SYNC fs: add RWF_DSYNC aand RWF_SYNC ceph: use generic_write_sync fs: simplify the generic_write_sync prototype fs: add IOCB_SYNC and IOCB_DSYNC direct-io: remove the offset argument to dio_complete direct-io: eliminate the offset argument to ->direct_IO xfs: eliminate the pos variable in xfs_file_dio_aio_write filemap: remove the pos argument to generic_file_direct_write filemap: remove pos variables in generic_file_read_iter |
||
|
|
c8b8e32d70 |
direct-io: eliminate the offset argument to ->direct_IO
Including blkdev_direct_IO and dax_do_io. It has to be ki_pos to actually work, so eliminate the superflous argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
b06bad17c7 |
mm: call swap_slot_free_notify() with page lock held
Kyeongdon reported below error which is BUG_ON(!PageSwapCache(page)) in
page_swap_info. The reason is that page_endio in rw_page unlocks the
page if read I/O is completed so we need to hold a PG_lock again to
check PageSwapCache. Otherwise, the page can be removed from swapcache.
Kernel BUG at c00f9040 [verbose debug info unavailable]
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 4 PID: 13446 Comm: RenderThread Tainted: G W 3.10.84-g9f14aec-dirty #73
task: c3b73200 ti: dd192000 task.ti: dd192000
PC is at page_swap_info+0x10/0x2c
LR is at swap_slot_free_notify+0x18/0x6c
pc : [<c00f9040>] lr : [<c00f5560>] psr: 400f0113
sp : dd193d78 ip : c2deb1e4 fp : da015180
r10: 00000000 r9 : 000200da r8 : c120fe08
r7 : 00000000 r6 : 00000000 r5 : c249a6c0 r4 : = c249a6c0
r3 : 00000000 r2 : 40080009 r1 : 200f0113 r0 : = c249a6c0
..<snip> ..
Call Trace:
page_swap_info+0x10/0x2c
swap_slot_free_notify+0x18/0x6c
swap_readpage+0x90/0x11c
read_swap_cache_async+0x134/0x1ac
swapin_readahead+0x70/0xb0
handle_pte_fault+0x320/0x6fc
handle_mm_fault+0xc0/0xf0
do_page_fault+0x11c/0x36c
do_DataAbort+0x34/0x118
Fixes:
|
||
|
|
09cbfeaf1a |
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
3f2b1a04f4 |
zram: revive swap_slot_free_notify
Commit |
||
|
|
1170532bb4 |
mm: convert printk(KERN_<LEVEL> to pr_<level>
Most of the mm subsystem uses pr_<level> so make it consistent. Miscellanea: - Realign arguments - Add missing newline to format - kmemleak-test.c has a "kmemleak: " prefix added to the "Kmemleak testing" logging message via pr_fmt Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Tejun Heo <tj@kernel.org> [percpu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
6cf66b4caf |
fs: use helper bio_add_page() instead of open coding on bi_io_vec
Call pre-defined helper bio_add_page() instead of open coding for iterating through bi_io_vec[]. Doing that, it's possible to make some parts in filesystems and mm/page_io.c simpler than before. Acked-by: Dave Kleikamp <shaggy@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> [dpark: add more description in commit message] Signed-off-by: Dongsu Park <dpark@posteo.net> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com> |
||
|
|
4246a0b63b |
block: add a bi_error field to struct bio
Currently we have two different ways to signal an I/O error on a BIO: (1) by clearing the BIO_UPTODATE flag (2) by returning a Linux errno value to the bi_end_io callback The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not beeing persistent when bios are queued up, and are not passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns. So add a new bi_error field to store an errno value directly in struct bio and remove the existing mechanisms to clean all this up. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com> |
||
|
|
343df3c79c |
suspend: simplify block I/O handling
Stop abusing struct page functionality and the swap end_io handler, and instead add a modified version of the blk-lib.c bio_batch helpers. Also move the block I/O code into swap.c as they are directly tied into each other. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Pavel Machek <pavel@ucw.cz> Tested-by: Ming Lin <mlin@kernel.org> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Jens Axboe <axboe@fb.com> |
||
|
|
22c6186ece |
direct_IO: remove rw from a_ops->direct_IO()
Now that no one is using rw, remove it completely. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
e2e40f2c1e |
fs: move struct kiocb to fs.h
struct kiocb now is a generic I/O container, so move it to fs.h. Also do a #include diet for aio.h while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
|
|
66ee59af63 |
fs: remove ki_nbytes
There is no need to pass the total request length in the kiocb, as we already get passed in through the iov_iter argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |