With CFI, a callback function passed to __queue_delayed_work from a
module can point to a jump table entry defined in the module instead
of the one used in the core kernel, which breaks this test:
WARN_ON_ONCE(timer->function != delayed_work_timer_fn);
To work around the problem, disable the warning when CFI and modules
are both enabled.
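The pointer mismatch can be pictured with a small userspace model. This is
only a sketch: the two "jump table entry" functions below are hypothetical
stand-ins for per-image CFI thunks, not code generated by any real compiler.

```c
#include <stdbool.h>

typedef int (*timer_fn_t)(int);

/* Stand-in for the real callback (delayed_work_timer_fn in the kernel). */
static int real_timer_fn(int x)
{
	return x + 1;
}

/* Hypothetical thunks: one per image, both forwarding to the same target. */
static int core_jump_entry(int x)
{
	return real_timer_fn(x);
}

static int module_jump_entry(int x)
{
	return real_timer_fn(x);
}

/* Address comparison, the kind of test the WARN_ON_ONCE() performs. */
static bool same_address(timer_fn_t a, timer_fn_t b)
{
	return a == b;
}

/* Behavioural comparison for a single input. */
static bool same_result(timer_fn_t a, timer_fn_t b, int x)
{
	return a(x) == b(x);
}
```

Both thunks end up in the same function, yet same_address() reports a
mismatch, which is why an address-based sanity check misfires in this model.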
Bug: 145210207
Change-Id: I2a631ea3da9e401af38accf1001082b93b9b3443
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Changes in 5.10.7
i40e: Fix Error I40E_AQ_RC_EINVAL when removing VFs
iavf: fix double-release of rtnl_lock
net/sched: sch_taprio: ensure to reset/destroy all child qdiscs
net: mvpp2: Add TCAM entry to drop flow control pause frames
net: mvpp2: prs: fix PPPoE with ipv6 packet parse
net: systemport: set dev->max_mtu to UMAC_MAX_MTU_SIZE
ethernet: ucc_geth: fix use-after-free in ucc_geth_remove()
ethernet: ucc_geth: set dev->max_mtu to 1518
ionic: account for vlan tag len in rx buffer len
atm: idt77252: call pci_disable_device() on error path
net: mvpp2: Fix GoP port 3 Networking Complex Control configurations
net: stmmac: dwmac-meson8b: ignore the second clock input
ibmvnic: fix login buffer memory leak
ibmvnic: continue fatal error reset after passive init
net: ethernet: mvneta: Fix error handling in mvneta_probe
qede: fix offload for IPIP tunnel packets
virtio_net: Fix recursive call to cpus_read_lock()
net/ncsi: Use real net-device for response handler
net: ethernet: Fix memleak in ethoc_probe
net-sysfs: take the rtnl lock when storing xps_cpus
net-sysfs: take the rtnl lock when accessing xps_cpus_map and num_tc
net-sysfs: take the rtnl lock when storing xps_rxqs
net-sysfs: take the rtnl lock when accessing xps_rxqs_map and num_tc
net: ethernet: ti: cpts: fix ethtool output when no ptp_clock registered
tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGS
e1000e: Only run S0ix flows if shutdown succeeded
e1000e: bump up timeout to wait when ME un-configures ULP mode
Revert "e1000e: disable s0ix entry and exit flows for ME systems"
e1000e: Export S0ix flags to ethtool
bnxt_en: Check TQM rings for maximum supported value.
net: mvpp2: fix pkt coalescing int-threshold configuration
bnxt_en: Fix AER recovery.
ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()
net: sched: prevent invalid Scell_log shift count
net: hns: fix return value check in __lb_other_process()
erspan: fix version 1 check in gre_parse_header()
net: hdlc_ppp: Fix issues when mod_timer is called while timer is running
bareudp: set NETIF_F_LLTX flag
bareudp: Fix use of incorrect min_headroom size
vhost_net: fix ubuf refcount incorrectly when sendmsg fails
r8169: work around power-saving bug on some chip versions
net: dsa: lantiq_gswip: Enable GSWIP_MII_CFG_EN also for internal PHYs
net: dsa: lantiq_gswip: Fix GSWIP_MII_CFG(p) register access
CDC-NCM: remove "connected" log message
ibmvnic: fix: NULL pointer dereference.
net: usb: qmi_wwan: add Quectel EM160R-GL
selftests: mlxsw: Set headroom size of correct port
stmmac: intel: Add PCI IDs for TGL-H platform
selftests/vm: fix building protection keys test
block: add debugfs stanza for QUEUE_FLAG_NOWAIT
workqueue: Kick a worker based on the actual activation of delayed works
scsi: ufs: Fix wrong print message in dev_err()
scsi: ufs-pci: Fix restore from S4 for Intel controllers
scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
scsi: block: Introduce BLK_MQ_REQ_PM
scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
local64.h: make <asm/local64.h> mandatory
lib/genalloc: fix the overflow when size is too big
depmod: handle the case of /sbin/depmod without /sbin in PATH
scsi: ufs: Clear UAC for FFU and RPMB LUNs
kbuild: don't hardcode depmod path
Bluetooth: revert: hci_h5: close serdev device and free hu in h5_close
scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
scsi: block: Do not accept any requests while suspended
crypto: ecdh - avoid buffer overflow in ecdh_set_secret()
crypto: asym_tpm: correct zero out potential secrets
powerpc: Handle .text.{hot,unlikely}.* in linker script
Staging: comedi: Return -EFAULT if copy_to_user() fails
staging: mt7621-dma: Fix a resource leak in an error handling path
usb: gadget: enable super speed plus
USB: cdc-acm: blacklist another IR Droid device
USB: cdc-wdm: Fix use after free in service_outstanding_interrupt().
usb: typec: intel_pmc_mux: Configure HPD first for HPD+IRQ request
usb: dwc3: meson-g12a: disable clk on error handling path in probe
usb: dwc3: gadget: Restart DWC3 gadget when enabling pullup
usb: dwc3: gadget: Clear wait flag on dequeue
usb: dwc3: ulpi: Use VStsDone to detect PHY regs access completion
usb: dwc3: ulpi: Replace CPU-based busyloop with Protocol-based one
usb: dwc3: ulpi: Fix USB2.0 HS/FS/LS PHY suspend regression
usb: chipidea: ci_hdrc_imx: add missing put_device() call in usbmisc_get_init_data()
USB: xhci: fix U1/U2 handling for hardware with XHCI_INTEL_HOST quirk set
usb: usbip: vhci_hcd: protect shift size
usb: uas: Add PNY USB Portable SSD to unusual_uas
USB: serial: iuu_phoenix: fix DMA from stack
USB: serial: option: add LongSung M5710 module support
USB: serial: option: add Quectel EM160R-GL
USB: yurex: fix control-URB timeout handling
USB: usblp: fix DMA to stack
ALSA: usb-audio: Fix UBSAN warnings for MIDI jacks
usb: gadget: select CONFIG_CRC32
USB: Gadget: dummy-hcd: Fix shift-out-of-bounds bug
usb: gadget: f_uac2: reset wMaxPacketSize
usb: gadget: function: printer: Fix a memory leak for interface descriptor
usb: gadget: u_ether: Fix MTU size mismatch with RX packet size
USB: gadget: legacy: fix return error code in acm_ms_bind()
usb: gadget: Fix spinlock lockup on usb_function_deactivate
usb: gadget: configfs: Preserve function ordering after bind failure
usb: gadget: configfs: Fix use-after-free issue with udc_name
USB: serial: keyspan_pda: remove unused variable
hwmon: (amd_energy) fix allocation of hwmon_channel_info config
mm: make wait_on_page_writeback() wait for multiple pending writebacks
x86/mm: Fix leak of pmd ptlock
KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte()
KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE
kvm: check tlbs_dirty directly
KVM: x86/mmu: Ensure TDP MMU roots are freed after yield
x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR
x86/resctrl: Don't move a task to the same resource group
blk-iocost: fix NULL iocg deref from racing against initialization
ALSA: hda/via: Fix runtime PM for Clevo W35xSS
ALSA: hda/conexant: add a new hda codec CX11970
ALSA: hda/realtek - Fix speaker volume control on Lenovo C940
ALSA: hda/realtek: Add mute LED quirk for more HP laptops
ALSA: hda/realtek: Enable mute and micmute LED on HP EliteBook 850 G7
ALSA: hda/realtek: Add two "Intel Reference board" SSID in the ALC256.
iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev
btrfs: qgroup: don't try to wait flushing if we're already holding a transaction
btrfs: send: fix wrong file path when there is an inode with a pending rmdir
Revert "device property: Keep secondary firmware node secondary by type"
dmabuf: fix use-after-free of dmabuf's file->f_inode
arm64: link with -z norelro for LLD or aarch64-elf
drm/i915: clear the shadow batch
drm/i915: clear the gpu reloc batch
bcache: fix typo from SUUP to SUPP in features.h
bcache: check unsupported feature sets for bcache register
bcache: introduce BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE for large bucket
net/mlx5e: Fix SWP offsets when vlan inserted by driver
ARM: dts: OMAP3: disable AES on N950/N9
netfilter: x_tables: Update remaining dereference to RCU
netfilter: ipset: fix shift-out-of-bounds in htable_bits()
netfilter: xt_RATEEST: reject non-null terminated string from userspace
netfilter: nft_dynset: report EOPNOTSUPP on missing set feature
dmaengine: idxd: off by one in cleanup code
x86/mtrr: Correct the range check before performing MTRR type lookups
KVM: x86: fix shift out of bounds reported by UBSAN
xsk: Fix memory leak for failed bind
rtlwifi: rise completion at the last step of firmware callback
scsi: target: Fix XCOPY NAA identifier lookup
Linux 5.10.7
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I1a7c195af35831fe362b027fe013c0c7e4dc20ea
[ Upstream commit 01341fbd0d8d4e717fc1231cdffe00343088ce0b ]
In a realtime scenario, we do not want any interference on the
isolated CPU cores. But when alloc_workqueue() is invoked for a percpu wq
on a housekeeping CPU, it kicks a kworker on the isolated CPU.
alloc_workqueue
pwq_adjust_max_active
wake_up_worker
The comment in pwq_adjust_max_active() says:
"Need to kick a worker after thawed or an unbound wq's
max_active is bumped"
So it is unnecessary to kick a kworker for a percpu wq when invoking
alloc_workqueue(). This patch only kicks a worker based on the actual
activation of delayed works.
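A minimal userspace sketch of the idea, assuming a simplified model in which
counters replace the real work lists and a wakeups counter replaces
wake_up_worker(). The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Simplified model of a pool_workqueue: counters instead of work lists. */
struct pwq_kick_model {
	int nr_active;	/* work items currently active */
	int max_active;	/* activation limit */
	int nr_delayed;	/* work items waiting for activation */
	int wakeups;	/* how often a worker was kicked */
};

/*
 * Raise or lower max_active and activate delayed works that now fit.
 * A worker is kicked only if at least one delayed work was activated.
 */
static void adjust_max_active_model(struct pwq_kick_model *pwq, int new_max)
{
	bool kick = false;

	pwq->max_active = new_max;

	while (pwq->nr_delayed > 0 && pwq->nr_active < pwq->max_active) {
		pwq->nr_delayed--;
		pwq->nr_active++;
		kick = true;
	}

	if (kick)
		pwq->wakeups++;
}
```

A freshly allocated queue has nothing delayed, so in this model adjusting its
max_active causes no wakeup at all.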
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Export workqueue_execute_start/end tracepoints, so that vendor modules
can register probes for these tracepoints.
Bug: 175936268
Change-Id: Ib4c8f39ff8305a1d52fbca9d06b5e792396a3a2d
Signed-off-by: Changki Kim <changki.kim@samsung.com>
Steps on the way to 5.10-rc1
Resolves conflicts in:
fs/userfaultfd.c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie3fe3c818f1f6565cfd4fa551de72d2b72ef60af
As warned by Sphinx:
./Documentation/core-api/workqueue:400: ./kernel/workqueue.c:1218: WARNING: Unexpected indentation.
the return code table is currently not recognized, as it lacks
markups.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
- Add the hook to provide additional information like
a task scheduling log.
Bug: 169374262
Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
Change-Id: I203dbc6faa77687ea48769f76658d28b29ef46fd
(cherry picked from commit 2ea974a00c7bdbbee140d68d8867ddcbfb529ecc)
This reverts commit fc33a8fd54 as CFI is
being removed from the tree to come back later as a "clean" set of
patches.
Bug: 145210207
Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie2c41854aa7613c7466dda6e88b3ce4b48460b80
Any runtime WARN_ON() has to be fixed, and BUILD_BUG_ON() can
help you notice it earlier.
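As a rough userspace illustration, assuming C11 and using _Static_assert as a
stand-in for BUILD_BUG_ON(): the struct and the BUILD_CHECK macro below are
hypothetical, and the condition is only an example of something that is
already known at compile time.

```c
#include <stdalign.h>
#include <stdio.h>

/* Hypothetical structure whose alignment is being checked. */
struct item {
	long long payload;
	int flags;
};

/* Userspace stand-in for BUILD_BUG_ON(): fails the build if cond is true. */
#define BUILD_CHECK(cond) \
	_Static_assert(!(cond), "build-time check failed: " #cond)

/* Runtime variant: the problem is only reported when this code runs. */
static int runtime_check(void)
{
	if (alignof(struct item) < alignof(long long)) {
		fprintf(stderr, "unexpected alignment\n");
		return 1;
	}
	return 0;
}

/* Build-time variant: a violation stops compilation, nothing runs. */
static int buildtime_check(void)
{
	BUILD_CHECK(alignof(struct item) < alignof(long long));
	return 0;
}
```

Both functions test the same condition; the second one costs nothing at
runtime and cannot be missed by a configuration that never executes the path.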
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
There is no point in calling unlock() and then lock() on the same mutex
back to back.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
008847f66c ("workqueue: allow rescuer thread to do more work.") made
the rescuer worker requeue the pwq immediately if there may be more
work items which need rescuing instead of waiting for the next mayday
timer expiration. Unfortunately, it checks only whether the pool needs
help from rescuers, but it doesn't check whether the pwq has work items
in the pool (the real reason that this rescuer can help for the pool).
The patch adds the check and avoids unneeded requeuing.
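The difference between the two conditions can be sketched in userspace as
follows. The model types and function names are hypothetical and reduce the
pool and pwq state to two fields.

```c
#include <stdbool.h>

/* Reduced model: does the pool still lack workers? */
struct rescue_pool_model {
	bool needs_worker;
};

/* Reduced model: how many of this pwq's work items sit in the pool? */
struct rescue_pwq_model {
	int nr_active;
	struct rescue_pool_model *pool;
};

/* Condition before the patch: looks at the pool only. */
static bool should_requeue_pool_only(const struct rescue_pwq_model *pwq)
{
	return pwq->pool->needs_worker;
}

/* Condition after the patch: the pwq must also have work in the pool. */
static bool should_requeue_checked(const struct rescue_pwq_model *pwq)
{
	return pwq->nr_active > 0 && pwq->pool->needs_worker;
}
```

With an idle pwq on a starved pool, the first condition requeues although the
rescuer has nothing of its own to process there; the second one does not.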
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The workqueue code has its internal spinlocks (pool::lock), which
are acquired on most workqueue operations. These spinlocks are
converted to 'sleeping' spinlocks on a RT-kernel.
Workqueue functions can be invoked from contexts which are truly atomic
even on a PREEMPT_RT enabled kernel. Taking sleeping locks from such
contexts is forbidden.
The pool::lock hold times are bounded and the code sections are
relatively short, which allows converting pool::lock and, as a
consequence, wq_mayday_lock to raw spinlocks which are truly spinning
locks even on a PREEMPT_RT kernel.
With the previous conversion of the manager waitqueue to a simple
waitqueue, workqueues are now fully RT compliant.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The workqueue code has its internal spinlock (pool::lock) and also
implicit spinlock usage in the wq_manager waitqueue. These spinlocks
are converted to 'sleeping' spinlocks on a RT-kernel.
Workqueue functions can be invoked from contexts which are truly atomic
even on a PREEMPT_RT enabled kernel. Taking sleeping locks from such
contexts is forbidden.
pool::lock can be converted to a raw spinlock as the lock held times
are short. But the workqueue manager waitqueue is handled inside of
pool::lock held regions which again violates the lock nesting rules
of raw and regular spinlocks.
The manager waitqueue has no special requirements like custom wakeup
callbacks or mass wakeups. While it does not use exclusive wait mode
explicitly there is no strict requirement to queue the waiters in a
particular order as there is only one waiter at a time.
This allows replacing the waitqueue with rcuwait, which solves the
locking problem because rcuwait relies on existing locking.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
The data structure member "wq->rescuer" was reset to a null pointer
in one if branch. It was passed to a call of the function "kfree"
in the callback function "rcu_free_wq" (which was eventually executed).
The function "kfree" does not perform more meaningful data processing
for a passed null pointer (besides immediately returning from such a call).
Thus delete this function call which became unnecessary with the referenced
software update.
Fixes: def98c84b6 ("workqueue: Fix spurious sanity check failures in destroy_workqueue()")
Suggested-by: Markus Elfring <Markus.Elfring@web.de>
Signed-off-by: Zhang Qiang <qiang.zhang@windriver.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
We need to preserve error code before freeing "rescuer".
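The ordering issue can be shown with a userspace sketch that uses
malloc()/free() in place of the kernel allocator. The structure and function
are hypothetical; only the order of "save the error, then free" is the point.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical object that records why its setup failed (0 means success). */
struct rescuer_model {
	long task_err;
};

/*
 * Allocate the object, simulate a setup result and clean up on failure.
 * The error is copied into a local variable before the object is freed;
 * reading it after free() would be a use-after-free.
 */
static int init_rescuer_model(long simulated_err)
{
	struct rescuer_model *r = malloc(sizeof(*r));
	int ret;

	if (!r)
		return -ENOMEM;

	r->task_err = simulated_err;
	if (r->task_err) {
		ret = (int)r->task_err;	/* preserve the code first */
		free(r);
		return ret;
	}

	free(r);	/* the model keeps nothing around on success either */
	return 0;
}
```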
Fixes: f187b6974f ("workqueue: Use IS_ERR and PTR_ERR instead of PTR_ERR_OR_ZERO.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Replace inline function PTR_ERR_OR_ZERO with IS_ERR and PTR_ERR to
remove redundant parameter definitions and checks.
Reduce code size.
Before:
   text    data     bss     dec     hex filename
  47510    5979     840   54329    d439 kernel/workqueue.o
After:
   text    data     bss     dec     hex filename
  47474    5979     840   54293    d415 kernel/workqueue.o
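For readers without the kernel headers at hand, the two styles can be compared
with userspace re-creations of the helpers. These are simplified stand-ins
written for this sketch, not the kernel's definitions, and check_before() and
check_after() are hypothetical callers.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins: error codes live in the top MAX_ERRNO addresses. */
static inline void *ERR_PTR(intptr_t error)
{
	return (void *)error;
}

static inline intptr_t PTR_ERR(const void *ptr)
{
	return (intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Old style: an extra local variable and a second test on it. */
static int check_before(void *p)
{
	int ret = PTR_ERR_OR_ZERO(p);

	if (ret)
		return ret;
	return 0;
}

/* New style: test and convert only on the error path. */
static int check_after(void *p)
{
	if (IS_ERR(p))
		return (int)PTR_ERR(p);
	return 0;
}
```

Both callers return the same values; the second form simply drops the
intermediate variable and its check.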
Signed-off-by: Sean Fu <fxinrong@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The kernel test robot triggered a warning with the following race:
  task-ctx A                            interrupt-ctx B
 worker
  -> process_one_work()
    -> work_item()
      -> schedule();
         -> sched_submit_work()
           -> wq_worker_sleeping()
             -> ->sleeping = 1
               atomic_dec_and_test(nr_running)
         __schedule();                *interrupt*
                                       async_page_fault()
                                       -> local_irq_enable();
                                       -> schedule();
                                          -> sched_submit_work()
                                            -> wq_worker_sleeping()
                                               -> if (WARN_ON(->sleeping)) return
                                          -> __schedule()
                                            -> sched_update_worker()
                                              -> wq_worker_running()
                                                 -> atomic_inc(nr_running);
                                                 -> ->sleeping = 0;
      -> sched_update_worker()
        -> wq_worker_running()
          if (!->sleeping) return
In this context the warning is pointless; everything is fine.
An interrupt before wq_worker_sleeping() will perform the ->sleeping
assignment (0 -> 1 -> 0) twice.
An interrupt after wq_worker_sleeping() will trigger the warning and
nr_running will be decremented (by A) and incremented once (only by B, A
will skip it). This is the case until the ->sleeping is zeroed again in
wq_worker_running().
Remove the WARN statement because this condition may happen. Document
that preemption around wq_worker_sleeping() needs to be disabled to
protect ->sleeping and not just as an optimisation.
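The accounting in the nested case can be replayed with a small userspace
model. It is a sketch only: plain integers replace the atomic counter, the
names are hypothetical, and the calls are made sequentially in the order the
race produces them.

```c
/* Reduced model of the per-worker state involved in the race. */
struct worker_model {
	int sleeping;	/* set while the worker is accounted as sleeping */
	int nr_running;	/* stands in for the pool's nr_running counter */
};

/* Model of wq_worker_sleeping() without the WARN: a nested call is a no-op. */
static void worker_sleeping_model(struct worker_model *w)
{
	if (w->sleeping)
		return;
	w->sleeping = 1;
	w->nr_running--;
}

/* Model of wq_worker_running(): only undoes a recorded sleep. */
static void worker_running_model(struct worker_model *w)
{
	if (!w->sleeping)
		return;
	w->nr_running++;
	w->sleeping = 0;
}
```

Replaying A: sleeping, B: sleeping, B: running, A: running leaves nr_running
at its starting value, which matches the claim that the nested case is
harmless once the WARN is removed.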
Fixes: 6d25be5782 ("sched/core, workqueues: Distangle worker accounting from rq lock")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Link: https://lkml.kernel.org/r/20200327074308.GY11705@shao2-debian
Pull workqueue updates from Tejun Heo:
"Nothing too interesting. Just two trivial patches"
* 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Mark up unlocked access to wq->first_flusher
workqueue: Make workqueue_init*() return void
A previous change added a test on the wrong config flag; rename
CFI to CFI_CLANG.
Bug: 145210207
Change-Id: Id8aead2eb2c75ad6442d10165f6cb86ccfb9c2f9
Signed-off-by: Alistair Delva <adelva@google.com>
[ 7329.671518] BUG: KCSAN: data-race in flush_workqueue / flush_workqueue
[ 7329.671549]
[ 7329.671572] write to 0xffff8881f65fb250 of 8 bytes by task 37173 on cpu 2:
[ 7329.671607] flush_workqueue+0x3bc/0x9b0 (kernel/workqueue.c:2844)
[ 7329.672527]
[ 7329.672540] read to 0xffff8881f65fb250 of 8 bytes by task 37175 on cpu 0:
[ 7329.672571] flush_workqueue+0x28d/0x9b0 (kernel/workqueue.c:2835)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
wq_select_unbound_cpu() is designed for unbound workqueues only, but
it's wrongly called when using a bound workqueue too.
Fixing this ensures work queued to a bound workqueue with
cpu=WORK_CPU_UNBOUND always runs on the local CPU.
Before, that would happen only if wq_unbound_cpumask happened to include
it (likely almost always the case), or was empty, or we got lucky with
forced round-robin placement. So restricting
/sys/devices/virtual/workqueue/cpumask to a small subset of a machine's
CPUs would cause some bound work items to run unexpectedly there.
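A simplified userspace model of the CPU choice, assuming an 8-CPU machine and
a bitmask in place of the cpumask. The helper names are hypothetical, and the
fallback picks the first allowed CPU rather than doing round-robin.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 8

/*
 * Simplified selection for unbound work: stay local if the mask is empty
 * or allows the local CPU, otherwise take the first allowed CPU.
 */
static int select_unbound_cpu_model(int local_cpu, unsigned int unbound_mask)
{
	int cpu;

	if (unbound_mask == 0 || (unbound_mask & (1u << local_cpu)))
		return local_cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (unbound_mask & (1u << cpu))
			return cpu;

	return local_cpu;
}

/* CPU choice for a request without an explicit CPU, after the fix. */
static int pick_cpu_model(bool wq_is_unbound, int local_cpu,
			  unsigned int unbound_mask)
{
	if (wq_is_unbound)
		return select_unbound_cpu_model(local_cpu, unbound_mask);

	/* Bound workqueue: the unbound mask is not consulted at all. */
	return local_cpu;
}
```

With the mask restricted to CPUs 0-1, a bound work item queued from CPU 5
stays on CPU 5 in this model, while an unbound one is redirected.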
Fixes: ef55718044 ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Hillf Danton <hdanton@sina.com>
[dj: massage changelog]
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
The return values of workqueue_init() and workqueue_init_early() are
always 0, and there is no usage of their return value. So just make
them return void.
Signed-off-by: Yu Chen <chen.yu@easystack.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
With non-canonical CFI, LLVM generates jump table entries for external
symbols in modules and as a result, a function pointer passed from a
module to the core kernel will have a different address.
Disable the warning for now.
Bug: 145210207
Change-Id: Ifdcee3479280f7b97abdee6b4c746f447e0944e6
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Alistair Delva <adelva@google.com>
Pull scheduler updates from Ingo Molnar:
"These were the main changes in this cycle:
- More -rt motivated separation of CONFIG_PREEMPT and
CONFIG_PREEMPTION.
- Add more low level scheduling topology sanity checks and warnings
to filter out nonsensical topologies that break scheduling.
- Extend uclamp constraints to influence wakeup CPU placement
- Make the RT scheduler more aware of asymmetric topologies and CPU
capacities, via uclamp metrics, if CONFIG_UCLAMP_TASK=y
- Make idle CPU selection more consistent
- Various fixes, smaller cleanups, updates and enhancements - please
see the git log for details"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
sched/fair: Define sched_idle_cpu() only for SMP configurations
sched/topology: Assert non-NUMA topology masks don't (partially) overlap
idle: fix spelling mistake "iterrupts" -> "interrupts"
sched/fair: Remove redundant call to cpufreq_update_util()
sched/psi: create /proc/pressure and /proc/pressure/{io|memory|cpu} only when psi enabled
sched/fair: Fix sgc->{min,max}_capacity calculation for SD_OVERLAP
sched/fair: calculate delta runnable load only when it's needed
sched/cputime: move rq parameter in irqtime_account_process_tick
stop_machine: Make stop_cpus() static
sched/debug: Reset watchdog on all CPUs while processing sysrq-t
sched/core: Fix size of rq::uclamp initialization
sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
sched/fair: Load balance aggressively for SCHED_IDLE CPUs
sched/fair : Improve update_sd_pick_busiest for spare capacity case
watchdog: Remove soft_lockup_hrtimer_cnt and related code
sched/rt: Make RT capacity-aware
sched/fair: Make EAS wakeup placement consider uclamp restrictions
sched/fair: Make task_fits_capacity() consider uclamp restrictions
sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with()
sched/uclamp: Make uclamp util helpers use and return UL values
...
It's surprising that workqueue_execute_end includes only the work when
its counterpart workqueue_execute_start has both the work and the worker
function.
You can't set a tracing filter or trigger based on the function, and
postprocessing scripts interested in specific functions are harder to
write since they have to remember the work from _start and match it up
with the same field in _end.
Add the function name, taking care to use the copy stashed in the
worker since the work is no longer safe to touch.
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Pull RCU updates from Ingo Molnar:
"The main changes in this cycle were:
- Dynamic tick (nohz) updates, perhaps most notably changes to force
the tick on when needed due to lengthy in-kernel execution on CPUs
on which RCU is waiting.
- Linux-kernel memory consistency model updates.
- Replace rcu_swap_protected() with rcu_replace_pointer().
- Torture-test updates.
- Documentation updates.
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer()
net/sched: Replace rcu_swap_protected() with rcu_replace_pointer()
net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer()
net/core: Replace rcu_swap_protected() with rcu_replace_pointer()
bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer()
fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer()
drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer()
drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer()
x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer()
rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer()
rcu: Suppress levelspread uninitialized messages
rcu: Fix uninitialized variable in nocb_gp_wait()
rcu: Update descriptions for rcu_future_grace_period tracepoint
rcu: Update descriptions for rcu_nocb_wake tracepoint
rcu: Remove obsolete descriptions for rcu_barrier tracepoint
rcu: Ensure that ->rcu_urgent_qs is set before resched IPI
workqueue: Convert for_each_wq to use built-in list check
rcu: Several rcu_segcblist functions can be static
rcu: Remove unused function hlist_bl_del_init_rcu()
Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch()
...
An additional check has been recently added to ensure that an RCU-related lock
is held while the RCU list is iterated.
The `pwqs' are sometimes iterated without an RCU lock but with the &wq->mutex
acquired, leading to a warning.
Teach list_for_each_entry_rcu() that the RCU usage is okay if &wq->mutex
is acquired during the list traversal.
Fixes: 28875945ba ("rcu: Add support for consolidated-RCU reader checking")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Because list_for_each_entry_rcu() can now check for holding a
lock as well as for being in an RCU read-side critical section,
this commit replaces the workqueue_sysfs_unregister() function's
use of assert_rcu_or_wq_mutex() and list_for_each_entry_rcu() with
list_for_each_entry_rcu() augmented with a lockdep_is_held() optional
argument.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
008847f66c ("workqueue: allow rescuer thread to do more work.") made
the rescuer worker requeue the pwq immediately if there may be more
work items which need rescuing instead of waiting for the next mayday
timer expiration. Unfortunately, it doesn't check whether the pwq is
already on the mayday list and unconditionally gets the ref and moves
it onto the list. This doesn't corrupt the list but creates an
additional reference to the pwq. It got queued twice but will only be
removed once.
This leak later can trigger pwq refcnt warning on workqueue
destruction and prevent freeing of the workqueue.
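The leak can be reproduced in a userspace model where a flag stands in for
membership on the mayday list. All names are hypothetical; the sketch only
tracks the reference count.

```c
#include <stdbool.h>

/* Reduced model: a reference count and a mayday-list membership flag. */
struct mayday_pwq_model {
	int refcnt;
	bool on_mayday_list;
};

/* Requeue without checking membership: takes a reference every time. */
static void requeue_unchecked(struct mayday_pwq_model *pwq)
{
	pwq->refcnt++;
	pwq->on_mayday_list = true;
}

/* Requeue with the check: a pwq already on the list is left alone. */
static void requeue_checked(struct mayday_pwq_model *pwq)
{
	if (pwq->on_mayday_list)
		return;
	pwq->refcnt++;
	pwq->on_mayday_list = true;
}

/* The rescuer takes the pwq off the list once and drops one reference. */
static void rescuer_dequeue(struct mayday_pwq_model *pwq)
{
	if (!pwq->on_mayday_list)
		return;
	pwq->on_mayday_list = false;
	pwq->refcnt--;
}
```

Queueing twice and dequeueing once leaves an extra reference in the unchecked
variant, which is the kind of imbalance that later trips the refcnt warning.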
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Williams, Gerald S" <gerald.s.williams@intel.com>
Cc: NeilBrown <neilb@suse.de>
Cc: stable@vger.kernel.org # v3.19+
destroy_workqueue() warnings still, at a lower frequency, trigger
spuriously. The problem seems to be in-flight operations which
haven't reached put_pwq() yet.
* Make sanity check grab all the related locks so that it's
synchronized against operations which put pwq at the end.
* Always print out the offending pwq.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Williams, Gerald S" <gerald.s.williams@intel.com>
* Now that wq->rescuer may be cleared while rescuer is still there,
switch show_pwq() debug printout to test worker->rescue_wq to
identify rescuers instead of testing wq->rescuer.
* Update comment on ->rescuer locking.
Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Lai Jiangshan <jiangshanlai@gmail.com>
Before actually destroying a workqueue, destroy_workqueue() checks
whether it's actually idle. If it isn't, it prints out a bunch of
warning messages and leaves the workqueue dangling. It unfortunately
has a couple of issues.
* Mayday list queueing increments pwq's refcnts which gets detected as
busy and fails the sanity checks. However, because mayday list
queueing is asynchronous, this condition can happen without any
actual work items left in the workqueue.
* Sanity check failure leaves the sysfs interface behind too which can
lead to init failure of newer instances of the workqueue.
This patch fixes the above two by
* If a workqueue has a rescuer, disable and kill the rescuer before
sanity checks. Disabling and killing is guaranteed to flush the
existing mayday list.
* Remove sysfs interface before sanity checks.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Marcin Pawlowski <mpawlowski@fb.com>
Reported-by: "Williams, Gerald S" <gerald.s.williams@intel.com>
Cc: stable@vger.kernel.org
All callers use GFP_KERNEL. No point in having that argument.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
None of those functions have any users outside of workqueue.c. Confine
them.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>