Merge 0746c4a9f3 ("Merge branch 'i2c/for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux") into android-mainline

Steps on the way to 5.10-rc1

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iec426c6de4a59a517e5fa575a9424b883d958f08
Greg Kroah-Hartman
2020-10-28 21:19:30 +01:00
144 changed files with 2274 additions and 1036 deletions

View File

@@ -35,3 +35,12 @@ Description:
controls the duration in milliseconds that blkback will not
cache any page not backed by a grant mapping.
The default is 10ms.
What: /sys/module/xen_blkback/parameters/feature_persistent
Date: September 2020
KernelVersion: 5.10
Contact: SeongJae Park <sjpark@amazon.de>
Description:
Whether to enable the persistent grants feature or not. Note
that this option only takes effect on newly created backends.
The default is Y (enable).
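As a minimal userspace sketch (assuming the module is loaded and sysfs is mounted at the standard location), the parameter can be queried like any other bool module parameter; the same pattern applies to the matching xen_blkfront entry below::

    /* Hedged sketch: read the bool module parameter described above.
     * The sysfs path follows the ABI entry; "Y\n" or "N\n" is the
     * kernel's bool formatting. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char val[4] = "";
        int fd = open("/sys/module/xen_blkback/parameters/feature_persistent",
                      O_RDONLY);

        if (fd < 0) {
            perror("open");
            return 1;
        }
        if (read(fd, val, sizeof(val) - 1) < 0)
            perror("read");
        close(fd);
        printf("persistent grants: %s", val);
        return 0;
    }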

View File

@@ -1,4 +1,4 @@
What: /sys/module/xen_blkfront/parameters/max
What: /sys/module/xen_blkfront/parameters/max_indirect_segments
Date: June 2013
KernelVersion: 3.11
Contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
@@ -8,3 +8,12 @@ Description:
is 32 - higher value means more potential throughput but more
memory usage. The backend picks the minimum of the frontend
and its default backend value.
What: /sys/module/xen_blkfront/parameters/feature_persistent
Date: September 2020
KernelVersion: 5.10
Contact: SeongJae Park <sjpark@amazon.de>
Description:
Whether to enable the persistent grants feature or not. Note
that this option only takes effect on newly created frontends.
The default is Y (enable).

View File

@@ -3,9 +3,9 @@ SafeSetID
=========
SafeSetID is an LSM module that gates the setid family of syscalls to restrict
UID/GID transitions from a given UID/GID to only those approved by a
system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs
system-wide allowlist. These restrictions also prohibit the given UIDs/GIDs
from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
allowing a user to set up user namespace UID mappings.
allowing a user to set up user namespace UID/GID mappings.
Background
@@ -98,10 +98,21 @@ Directions for use
==================
This LSM hooks the setid syscalls to make sure transitions are allowed if an
applicable restriction policy is in place. Policies are configured through
securityfs by writing to the safesetid/add_whitelist_policy and
safesetid/flush_whitelist_policies files at the location where securityfs is
mounted. The format for adding a policy is '<UID>:<UID>', using literal
numbers, such as '123:456'. To flush the policies, any write to the file is
sufficient. Again, configuring a policy for a UID will prevent that UID from
obtaining auxiliary setid privileges, such as allowing a user to set up user
namespace UID mappings.
securityfs by writing to the safesetid/uid_allowlist_policy and
safesetid/gid_allowlist_policy files at the location where securityfs is
mounted. The format for adding a policy is '<UID>:<UID>' or '<GID>:<GID>',
using literal numbers, and ending with a newline character such as '123:456\n'.
Writing an empty string "" will flush the policy. Again, configuring a policy
for a UID/GID will prevent that UID/GID from obtaining auxiliary setid
privileges, such as allowing a user to set up user namespace UID/GID mappings.
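As a minimal sketch (assuming securityfs is mounted at /sys/kernel/security, and using placeholder UIDs 123 and 456), a policy can be installed like this::

    /* Hedged example: allow UID 123 to transition to UID 456. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char *policy = "123:456\n";   /* note the trailing newline */
        int fd = open("/sys/kernel/security/safesetid/uid_allowlist_policy",
                      O_WRONLY);

        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* Writing an empty string "" instead would flush the policy. */
        if (write(fd, policy, strlen(policy)) < 0)
            perror("write");
        close(fd);
        return 0;
    }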
Note on GID policies and setgroups()
====================================
In v5.9 we are adding support for limiting CAP_SETGID privileges as was done
previously for CAP_SETUID. However, for compatibility with common sandboxing
related code conventions in userspace, we currently allow arbitrary
setgroups() calls for processes with CAP_SETGID restrictions. Until we add
support in a future release for restricting setgroups() calls, these GID
policies add no meaningful security. setgroups() restrictions will be enforced
once we have the policy checking code in place, which will rely on GID policy
configuration code added in v5.9.

View File

@@ -5982,6 +5982,13 @@
After which time (jiffies) the event handling loop
should start to delay EOI handling. Default is 2.
xen.fifo_events= [XEN]
Boolean parameter to disable using fifo event handling
even if available. Normally fifo event handling is
preferred over the 2-level event handling, as it is
fairer and the number of possible event channels is
much higher. Default is on (use fifo events).
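For example (an illustrative command line, not something this patch adds), fifo events can be disabled with::

    xen.fifo_events=0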
nopv= [X86,XEN,KVM,HYPER_V,VMWARE]
Disables the PV optimizations forcing the guest to run
as generic guest with no PV drivers. Currently support

View File

@@ -124,6 +124,10 @@ For zoned block devices (zoned attribute indicating "host-managed" or
EXPLICIT OPEN, IMPLICIT OPEN or CLOSED, is limited by this value.
If this value is 0, there is no limit.
If the host attempts to exceed this limit, the driver should report this error
with BLK_STS_ZONE_ACTIVE_RESOURCE, which user space may see as the EOVERFLOW
errno.
max_open_zones (RO)
-------------------
For zoned block devices (zoned attribute indicating "host-managed" or
@@ -131,6 +135,10 @@ For zoned block devices (zoned attribute indicating "host-managed" or
EXPLICIT OPEN or IMPLICIT OPEN, is limited by this value.
If this value is 0, there is no limit.
If the host attempts to exceed this limit, the driver should report this error
with BLK_STS_ZONE_OPEN_RESOURCE, which user space may see as the ETOOMANYREFS
errno.
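As an illustrative userspace sketch (the nullb0 device name and sysfs paths are assumptions), the two limits and the errno a violation maps to can be reported like this::

    /* Hedged sketch: print the zone resource limits described above
     * together with the errno each limit violation maps to. */
    #include <errno.h>
    #include <stdio.h>

    static long read_limit(const char *path)
    {
        long v = -1;
        FILE *f = fopen(path, "r");

        if (f) {
            if (fscanf(f, "%ld", &v) != 1)
                v = -1;
            fclose(f);
        }
        return v;
    }

    int main(void)
    {
        printf("max_open_zones=%ld (exceeded -> ETOOMANYREFS=%d)\n",
               read_limit("/sys/block/nullb0/queue/max_open_zones"),
               ETOOMANYREFS);
        printf("max_active_zones=%ld (exceeded -> EOVERFLOW=%d)\n",
               read_limit("/sys/block/nullb0/queue/max_active_zones"),
               EOVERFLOW);
        return 0;
    }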
max_sectors_kb (RW)
-------------------
This is the maximum number of kilobytes that the block layer will allow

View File

@@ -519,10 +519,9 @@ routines, e.g.:::
Part II - Non-coherent DMA allocations
--------------------------------------
These APIs allow allocating pages in the kernel direct mapping that are
guaranteed to be DMA addressable. This means that unlike dma_alloc_coherent,
virt_to_page can be called on the resulting address, and the resulting
struct page can be used for everything a struct page is suitable for.
These APIs allow allocating pages that are guaranteed to be DMA addressable
by the passed in device, but which need explicit management of memory ownership
for the kernel vs the device.
If you don't understand how cache line coherency works between a processor and
an I/O device, you should not be using this part of the API.
@@ -537,7 +536,7 @@ an I/O device, you should not be using this part of the API.
This routine allocates a region of <size> bytes of consistent memory. It
returns a pointer to the allocated region (in the processor's virtual address
space) or NULL if the allocation failed. The returned memory may or may not
be in the kernels direct mapping. Drivers must not call virt_to_page on
be in the kernel direct mapping. Drivers must not call virt_to_page on
the returned memory region.
It also returns a <dma_handle> which may be cast to an unsigned integer the
@@ -565,7 +564,45 @@ reused.
Free a region of memory previously allocated using dma_alloc_noncoherent().
dev, size, dma_handle and dir must all be the same as those passed into
dma_alloc_noncoherent(). cpu_addr must be the virtual address returned by
the dma_alloc_noncoherent().
dma_alloc_noncoherent().
::
struct page *
dma_alloc_pages(struct device *dev, size_t size, dma_addr_t *dma_handle,
enum dma_data_direction dir, gfp_t gfp)
This routine allocates a region of <size> bytes of non-coherent memory. It
returns a pointer to first struct page for the region, or NULL if the
allocation failed. The resulting struct page can be used for everything a
struct page is suitable for.
It also returns a <dma_handle> which may be cast to an unsigned integer the
same width as the bus and given to the device as the DMA address base of
the region.
The dir parameter specifies whether data is read and/or written by the device,
see dma_map_single() for details.
The gfp parameter allows the caller to specify the ``GFP_`` flags (see
kmalloc()) for the allocation, but rejects flags used to specify a memory
zone such as GFP_DMA or GFP_HIGHMEM.
Before giving the memory to the device, dma_sync_single_for_device() needs
to be called, and before reading memory written by the device,
dma_sync_single_for_cpu(), just like for streaming DMA mappings that are
reused.
::
void
dma_free_pages(struct device *dev, size_t size, struct page *page,
dma_addr_t dma_handle, enum dma_data_direction dir)
Free a region of memory previously allocated using dma_alloc_pages().
dev, size, dma_handle and dir must all be the same as those passed into
dma_alloc_pages(). page must be the pointer returned by
dma_alloc_pages().
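A minimal kernel-side sketch of the pattern described above (the function name, device pointer, size and direction are illustrative assumptions, not part of this patch)::

    /* Hedged example: allocate, hand ownership to the device, free. */
    #include <linux/dma-mapping.h>
    #include <linux/mm.h>
    #include <linux/string.h>

    static int example_dma_pages(struct device *dev)
    {
        dma_addr_t dma_handle;
        size_t size = PAGE_SIZE;
        struct page *page;

        page = dma_alloc_pages(dev, size, &dma_handle, DMA_TO_DEVICE,
                               GFP_KERNEL);
        if (!page)
            return -ENOMEM;

        /* CPU owns the buffer first: fill it, then pass ownership on. */
        memset(page_address(page), 0, size);
        dma_sync_single_for_device(dev, dma_handle, size, DMA_TO_DEVICE);

        /* ... device performs DMA from dma_handle here ... */

        dma_free_pages(dev, size, page, dma_handle, DMA_TO_DEVICE);
        return 0;
    }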
::

View File

@@ -22,7 +22,7 @@
#include <linux/platform_device.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/dma-mapping.h>
#include <linux/dma-map-ops.h>
#include <linux/clk.h>
#include <linux/io.h>

View File

@@ -6,7 +6,7 @@
* Copyright (C) 1999-2003 Matthew Wilcox <willy at parisc-linux.org>
* Copyright (C) 2000-2003 Paul Bame <bame at parisc-linux.org>
* Copyright (C) 2001 Thomas Bogendoerfer <tsbogend at parisc-linux.org>
* Copyright (C) 1999-2014 Helge Deller <deller@gmx.de>
* Copyright (C) 1999-2020 Helge Deller <deller@gmx.de>
*/
#include <linux/uaccess.h>
@@ -23,6 +23,7 @@
#include <linux/utsname.h>
#include <linux/personality.h>
#include <linux/random.h>
#include <linux/compat.h>
/* we construct an artificial offset for the mapping based on the physical
* address of the kernel mapping variable */
@@ -373,3 +374,73 @@ long parisc_personality(unsigned long personality)
return err;
}
/*
* Up to kernel v5.9 we defined O_NONBLOCK as 000200004,
* since then O_NONBLOCK is defined as 000200000.
*
* The following wrapper functions mask out the old
* O_NDELAY bit from calls which use O_NONBLOCK.
*
* XXX: Remove those in year 2022 (or later)?
*/
#define O_NONBLOCK_OLD 000200004
#define O_NONBLOCK_MASK_OUT (O_NONBLOCK_OLD & ~O_NONBLOCK)
static int FIX_O_NONBLOCK(int flags)
{
if (flags & O_NONBLOCK_MASK_OUT) {
struct task_struct *tsk = current;
pr_warn_once("%s(%d) uses a deprecated O_NONBLOCK value.\n",
tsk->comm, tsk->pid);
}
return flags & ~O_NONBLOCK_MASK_OUT;
}
asmlinkage long parisc_timerfd_create(int clockid, int flags)
{
flags = FIX_O_NONBLOCK(flags);
return sys_timerfd_create(clockid, flags);
}
asmlinkage long parisc_signalfd4(int ufd, sigset_t __user *user_mask,
size_t sizemask, int flags)
{
flags = FIX_O_NONBLOCK(flags);
return sys_signalfd4(ufd, user_mask, sizemask, flags);
}
#ifdef CONFIG_COMPAT
asmlinkage long parisc_compat_signalfd4(int ufd,
compat_sigset_t __user *user_mask,
compat_size_t sizemask, int flags)
{
flags = FIX_O_NONBLOCK(flags);
return compat_sys_signalfd4(ufd, user_mask, sizemask, flags);
}
#endif
asmlinkage long parisc_eventfd2(unsigned int count, int flags)
{
flags = FIX_O_NONBLOCK(flags);
return sys_eventfd2(count, flags);
}
asmlinkage long parisc_userfaultfd(int flags)
{
flags = FIX_O_NONBLOCK(flags);
return sys_userfaultfd(flags);
}
asmlinkage long parisc_pipe2(int __user *fildes, int flags)
{
flags = FIX_O_NONBLOCK(flags);
return sys_pipe2(fildes, flags);
}
asmlinkage long parisc_inotify_init1(int flags)
{
flags = FIX_O_NONBLOCK(flags);
return sys_inotify_init1(flags);
}

View File

@@ -344,17 +344,17 @@
304 common eventfd sys_eventfd
305 32 fallocate parisc_fallocate
305 64 fallocate sys_fallocate
306 common timerfd_create sys_timerfd_create
306 common timerfd_create parisc_timerfd_create
307 32 timerfd_settime sys_timerfd_settime32
307 64 timerfd_settime sys_timerfd_settime
308 32 timerfd_gettime sys_timerfd_gettime32
308 64 timerfd_gettime sys_timerfd_gettime
309 common signalfd4 sys_signalfd4 compat_sys_signalfd4
310 common eventfd2 sys_eventfd2
309 common signalfd4 parisc_signalfd4 parisc_compat_signalfd4
310 common eventfd2 parisc_eventfd2
311 common epoll_create1 sys_epoll_create1
312 common dup3 sys_dup3
313 common pipe2 sys_pipe2
314 common inotify_init1 sys_inotify_init1
313 common pipe2 parisc_pipe2
314 common inotify_init1 parisc_inotify_init1
315 common preadv sys_preadv compat_sys_preadv
316 common pwritev sys_pwritev compat_sys_pwritev
317 common rt_tgsigqueueinfo sys_rt_tgsigqueueinfo compat_sys_rt_tgsigqueueinfo
@@ -387,7 +387,7 @@
341 common bpf sys_bpf
342 common execveat sys_execveat compat_sys_execveat
343 common membarrier sys_membarrier
344 common userfaultfd sys_userfaultfd
344 common userfaultfd parisc_userfaultfd
345 common mlock2 sys_mlock2
346 common copy_file_range sys_copy_file_range
347 common preadv2 sys_preadv2 compat_sys_preadv2

View File

@@ -180,9 +180,16 @@ static int rtc_generic_get_time(struct device *dev, struct rtc_time *tm)
static int rtc_generic_set_time(struct device *dev, struct rtc_time *tm)
{
time64_t secs = rtc_tm_to_time64(tm);
int ret;
if (pdc_tod_set(secs, 0) < 0)
/* hppa has Y2K38 problem: pdc_tod_set() takes a u32 value! */
ret = pdc_tod_set(secs, 0);
if (ret != 0) {
pr_warn("pdc_tod_set(%lld) returned error %d\n", secs, ret);
if (ret == PDC_INVALID_ARG)
return -EINVAL;
return -EOPNOTSUPP;
}
return 0;
}

View File

@@ -11,4 +11,17 @@
# define __ASM_CONST(x) x##UL
# define ASM_CONST(x) __ASM_CONST(x)
#endif
/*
* Inline assembly memory constraint
*
* GCC 4.9 doesn't properly handle the pre-update memory constraint "m<>"
*
*/
#if defined(GCC_VERSION) && GCC_VERSION < 50000
#define UPD_CONSTR ""
#else
#define UPD_CONSTR "<>"
#endif
#endif /* _ASM_POWERPC_ASM_CONST_H */

View File

@@ -477,7 +477,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
CPU_FTR_DAWR | CPU_FTR_DAWR1)
#define CPU_FTRS_CELL (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \

View File

@@ -182,7 +182,7 @@ do { \
"1: " op "%U1%X1 %0,%1 # put_user\n" \
EX_TABLE(1b, %l2) \
: \
: "r" (x), "m<>" (*addr) \
: "r" (x), "m"UPD_CONSTR (*addr) \
: \
: label)
@@ -253,7 +253,7 @@ extern long __get_user_bad(void);
".previous\n" \
EX_TABLE(1b, 3b) \
: "=r" (err), "=r" (x) \
: "m<>" (*addr), "i" (-EFAULT), "0" (err))
: "m"UPD_CONSTR (*addr), "i" (-EFAULT), "0" (err))
#ifdef __powerpc64__
#define __get_user_asm2(x, addr, err) \

View File

@@ -121,9 +121,16 @@ extern void __restore_cpu_e6500(void);
PPC_FEATURE2_DARN | \
PPC_FEATURE2_SCV)
#define COMMON_USER_POWER10 COMMON_USER_POWER9
#define COMMON_USER2_POWER10 (COMMON_USER2_POWER9 | \
PPC_FEATURE2_ARCH_3_1 | \
PPC_FEATURE2_MMA)
#define COMMON_USER2_POWER10 (PPC_FEATURE2_ARCH_3_1 | \
PPC_FEATURE2_MMA | \
PPC_FEATURE2_ARCH_3_00 | \
PPC_FEATURE2_HAS_IEEE128 | \
PPC_FEATURE2_DARN | \
PPC_FEATURE2_SCV | \
PPC_FEATURE2_ARCH_2_07 | \
PPC_FEATURE2_DSCR | \
PPC_FEATURE2_ISEL | PPC_FEATURE2_TAR | \
PPC_FEATURE2_VEC_CRYPTO)
#ifdef CONFIG_PPC_BOOK3E_64
#define COMMON_USER_BOOKE (COMMON_USER_PPC64 | PPC_FEATURE_BOOKE)

View File

@@ -466,11 +466,6 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
return 0;
}
if (!pe->addr) {
eeh_stats.no_cfg_addr++;
return 0;
}
/*
* On PowerNV platform, we might already have fenced PHB
* there and we need take care of that firstly.

View File

@@ -591,12 +591,11 @@ EXPORT_SYMBOL_GPL(machine_check_print_event_info);
long notrace machine_check_early(struct pt_regs *regs)
{
long handled = 0;
bool nested = in_nmi();
u8 ftrace_enabled = this_cpu_get_ftrace_enabled();
this_cpu_set_ftrace_enabled(0);
if (!nested)
/* Do not use nmi_enter/exit for pseries hpte guest */
if (radix_enabled() || !firmware_has_feature(FW_FEATURE_LPAR))
nmi_enter();
hv_nmi_check_nonrecoverable(regs);
@@ -607,7 +606,7 @@ long notrace machine_check_early(struct pt_regs *regs)
if (ppc_md.machine_check_early)
handled = ppc_md.machine_check_early(regs);
if (!nested)
if (radix_enabled() || !firmware_has_feature(FW_FEATURE_LPAR))
nmi_exit();
this_cpu_set_ftrace_enabled(ftrace_enabled);

View File

@@ -1240,43 +1240,33 @@ static struct device_node *cpu_to_l2cache(int cpu)
return cache;
}
static bool update_mask_by_l2(int cpu)
static bool update_mask_by_l2(int cpu, cpumask_var_t *mask)
{
struct cpumask *(*submask_fn)(int) = cpu_sibling_mask;
struct device_node *l2_cache, *np;
cpumask_var_t mask;
int i;
l2_cache = cpu_to_l2cache(cpu);
if (!l2_cache) {
struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
/*
* If no l2cache for this CPU, assume all siblings to share
* cache with this CPU.
*/
if (has_big_cores)
sibling_mask = cpu_smallcore_mask;
submask_fn = cpu_smallcore_mask;
for_each_cpu(i, sibling_mask(cpu))
l2_cache = cpu_to_l2cache(cpu);
if (!l2_cache || !*mask) {
/* Assume only core siblings share cache with this CPU */
for_each_cpu(i, submask_fn(cpu))
set_cpus_related(cpu, i, cpu_l2_cache_mask);
return false;
}
alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
if (has_big_cores)
submask_fn = cpu_smallcore_mask;
cpumask_and(*mask, cpu_online_mask, cpu_cpu_mask(cpu));
/* Update l2-cache mask with all the CPUs that are part of submask */
or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
/* Skip all CPUs already part of current CPU l2-cache mask */
cpumask_andnot(mask, mask, cpu_l2_cache_mask(cpu));
cpumask_andnot(*mask, *mask, cpu_l2_cache_mask(cpu));
for_each_cpu(i, mask) {
for_each_cpu(i, *mask) {
/*
* when updating the marks the current CPU has not been marked
* online, but we need to update the cache masks
@@ -1286,15 +1276,14 @@ static bool update_mask_by_l2(int cpu)
/* Skip all CPUs already part of current CPU l2-cache */
if (np == l2_cache) {
or_cpumasks_related(cpu, i, submask_fn, cpu_l2_cache_mask);
cpumask_andnot(mask, mask, submask_fn(i));
cpumask_andnot(*mask, *mask, submask_fn(i));
} else {
cpumask_andnot(mask, mask, cpu_l2_cache_mask(i));
cpumask_andnot(*mask, *mask, cpu_l2_cache_mask(i));
}
of_node_put(np);
}
of_node_put(l2_cache);
free_cpumask_var(mask);
return true;
}
@@ -1337,40 +1326,46 @@ static inline void add_cpu_to_smallcore_masks(int cpu)
}
}
static void update_coregroup_mask(int cpu)
static void update_coregroup_mask(int cpu, cpumask_var_t *mask)
{
struct cpumask *(*submask_fn)(int) = cpu_sibling_mask;
cpumask_var_t mask;
int coregroup_id = cpu_to_coregroup_id(cpu);
int i;
alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
if (shared_caches)
submask_fn = cpu_l2_cache_mask;
if (!*mask) {
/* Assume only siblings are part of this CPU's coregroup */
for_each_cpu(i, submask_fn(cpu))
set_cpus_related(cpu, i, cpu_coregroup_mask);
return;
}
cpumask_and(*mask, cpu_online_mask, cpu_cpu_mask(cpu));
/* Update coregroup mask with all the CPUs that are part of submask */
or_cpumasks_related(cpu, cpu, submask_fn, cpu_coregroup_mask);
/* Skip all CPUs already part of coregroup mask */
cpumask_andnot(mask, mask, cpu_coregroup_mask(cpu));
cpumask_andnot(*mask, *mask, cpu_coregroup_mask(cpu));
for_each_cpu(i, mask) {
for_each_cpu(i, *mask) {
/* Skip all CPUs not part of this coregroup */
if (coregroup_id == cpu_to_coregroup_id(i)) {
or_cpumasks_related(cpu, i, submask_fn, cpu_coregroup_mask);
cpumask_andnot(mask, mask, submask_fn(i));
cpumask_andnot(*mask, *mask, submask_fn(i));
} else {
cpumask_andnot(mask, mask, cpu_coregroup_mask(i));
cpumask_andnot(*mask, *mask, cpu_coregroup_mask(i));
}
}
free_cpumask_var(mask);
}
static void add_cpu_to_masks(int cpu)
{
int first_thread = cpu_first_thread_sibling(cpu);
cpumask_var_t mask;
int i;
/*
@@ -1384,10 +1379,15 @@ static void add_cpu_to_masks(int cpu)
set_cpus_related(i, cpu, cpu_sibling_mask);
add_cpu_to_smallcore_masks(cpu);
update_mask_by_l2(cpu);
/* In CPU-hotplug path, hence use GFP_ATOMIC */
alloc_cpumask_var_node(&mask, GFP_ATOMIC, cpu_to_node(cpu));
update_mask_by_l2(cpu, &mask);
if (has_coregroup_support())
update_coregroup_mask(cpu);
update_coregroup_mask(cpu, &mask);
free_cpumask_var(mask);
}
/* Activate a secondary processor. */

View File

@@ -885,7 +885,7 @@ static void p9_hmi_special_emu(struct pt_regs *regs)
{
unsigned int ra, rb, t, i, sel, instr, rc;
const void __user *addr;
u8 vbuf[16], *vdst;
u8 vbuf[16] __aligned(16), *vdst;
unsigned long ea, msr, msr_mask;
bool swap;

View File

@@ -88,9 +88,14 @@ static ssize_t dump_ack_store(struct dump_obj *dump_obj,
const char *buf,
size_t count)
{
/*
* Try to self remove this attribute. If we are successful,
* delete the kobject itself.
*/
if (sysfs_remove_file_self(&dump_obj->kobj, &attr->attr)) {
dump_send_ack(dump_obj->id);
sysfs_remove_file_self(&dump_obj->kobj, &attr->attr);
kobject_put(&dump_obj->kobj);
}
return count;
}
@@ -318,15 +323,14 @@ static ssize_t dump_attr_read(struct file *filep, struct kobject *kobj,
return count;
}
static struct dump_obj *create_dump_obj(uint32_t id, size_t size,
uint32_t type)
static void create_dump_obj(uint32_t id, size_t size, uint32_t type)
{
struct dump_obj *dump;
int rc;
dump = kzalloc(sizeof(*dump), GFP_KERNEL);
if (!dump)
return NULL;
return;
dump->kobj.kset = dump_kset;
@@ -346,21 +350,39 @@ static struct dump_obj *create_dump_obj(uint32_t id, size_t size,
rc = kobject_add(&dump->kobj, NULL, "0x%x-0x%x", type, id);
if (rc) {
kobject_put(&dump->kobj);
return NULL;
return;
}
/*
* As soon as the sysfs file for this dump is created/activated there is
* a chance the opal_errd daemon (or any userspace) might read and
* acknowledge the dump before kobject_uevent() is called. If that
* happens then there is a potential race between
* dump_ack_store->kobject_put() and kobject_uevent() which leads to a
* use-after-free of a kernfs object resulting in a kernel crash.
*
* To avoid that, we need to take a reference on behalf of the bin file,
* so that our reference remains valid while we call kobject_uevent().
* We then drop our reference before exiting the function, leaving the
* bin file to drop the last reference (if it hasn't already).
*/
/* Take a reference for the bin file */
kobject_get(&dump->kobj);
rc = sysfs_create_bin_file(&dump->kobj, &dump->dump_attr);
if (rc) {
kobject_put(&dump->kobj);
return NULL;
}
if (rc == 0) {
kobject_uevent(&dump->kobj, KOBJ_ADD);
pr_info("%s: New platform dump. ID = 0x%x Size %u\n",
__func__, dump->id, dump->size);
} else {
/* Drop reference count taken for bin file */
kobject_put(&dump->kobj);
}
kobject_uevent(&dump->kobj, KOBJ_ADD);
return dump;
/* Drop our reference */
kobject_put(&dump->kobj);
return;
}
static irqreturn_t process_dump(int irq, void *data)

View File

@@ -72,9 +72,14 @@ static ssize_t elog_ack_store(struct elog_obj *elog_obj,
const char *buf,
size_t count)
{
/*
* Try to self remove this attribute. If we are successful,
* delete the kobject itself.
*/
if (sysfs_remove_file_self(&elog_obj->kobj, &attr->attr)) {
opal_send_ack_elog(elog_obj->id);
sysfs_remove_file_self(&elog_obj->kobj, &attr->attr);
kobject_put(&elog_obj->kobj);
}
return count;
}

View File

@@ -521,18 +521,55 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
return 0; /* need to perform reset */
}
static int mce_handle_err_realmode(int disposition, u8 error_type)
{
#ifdef CONFIG_PPC_BOOK3S_64
if (disposition == RTAS_DISP_NOT_RECOVERED) {
switch (error_type) {
case MC_ERROR_TYPE_SLB:
case MC_ERROR_TYPE_ERAT:
/*
* Store the old slb content in paca before flushing.
* Print this when we go to virtual mode.
* There are chances that we may hit MCE again if there
* is a parity error on the SLB entry we are trying to read
* for saving. Hence limit the slb saving to single
* level of recursion.
*/
if (local_paca->in_mce == 1)
slb_save_contents(local_paca->mce_faulty_slbs);
flush_and_reload_slb();
disposition = RTAS_DISP_FULLY_RECOVERED;
break;
default:
break;
}
} else if (disposition == RTAS_DISP_LIMITED_RECOVERY) {
/* Platform corrected itself but could be degraded */
pr_err("MCE: limited recovery, system may be degraded\n");
disposition = RTAS_DISP_FULLY_RECOVERED;
}
#endif
return disposition;
}
static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
static int mce_handle_err_virtmode(struct pt_regs *regs,
struct rtas_error_log *errp,
struct pseries_mc_errorlog *mce_log,
int disposition)
{
struct mce_error_info mce_err = { 0 };
unsigned long eaddr = 0, paddr = 0;
struct pseries_errorlog *pseries_log;
struct pseries_mc_errorlog *mce_log;
int disposition = rtas_error_disposition(errp);
int initiator = rtas_error_initiator(errp);
int severity = rtas_error_severity(errp);
unsigned long eaddr = 0, paddr = 0;
u8 error_type, err_sub_type;
if (!mce_log)
goto out;
error_type = mce_log->error_type;
err_sub_type = rtas_mc_error_sub_type(mce_log);
if (initiator == RTAS_INITIATOR_UNKNOWN)
mce_err.initiator = MCE_INITIATOR_UNKNOWN;
else if (initiator == RTAS_INITIATOR_CPU)
@@ -571,18 +608,7 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
mce_err.error_class = MCE_ECLASS_UNKNOWN;
if (!rtas_error_extended(errp))
goto out;
pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
if (pseries_log == NULL)
goto out;
mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
error_type = mce_log->error_type;
err_sub_type = rtas_mc_error_sub_type(mce_log);
switch (mce_log->error_type) {
switch (error_type) {
case MC_ERROR_TYPE_UE:
mce_err.error_type = MCE_ERROR_TYPE_UE;
mce_common_process_ue(regs, &mce_err);
@@ -682,37 +708,31 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
break;
}
#ifdef CONFIG_PPC_BOOK3S_64
if (disposition == RTAS_DISP_NOT_RECOVERED) {
switch (error_type) {
case MC_ERROR_TYPE_SLB:
case MC_ERROR_TYPE_ERAT:
/*
* Store the old slb content in paca before flushing.
* Print this when we go to virtual mode.
* There are chances that we may hit MCE again if there
* is a parity error on the SLB entry we are trying to read
* for saving. Hence limit the slb saving to single
* level of recursion.
*/
if (local_paca->in_mce == 1)
slb_save_contents(local_paca->mce_faulty_slbs);
flush_and_reload_slb();
disposition = RTAS_DISP_FULLY_RECOVERED;
break;
default:
break;
}
} else if (disposition == RTAS_DISP_LIMITED_RECOVERY) {
/* Platform corrected itself but could be degraded */
printk(KERN_ERR "MCE: limited recovery, system may "
"be degraded\n");
disposition = RTAS_DISP_FULLY_RECOVERED;
}
#endif
out:
save_mce_event(regs, disposition == RTAS_DISP_FULLY_RECOVERED,
&mce_err, regs->nip, eaddr, paddr);
return disposition;
}
static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
{
struct pseries_errorlog *pseries_log;
struct pseries_mc_errorlog *mce_log = NULL;
int disposition = rtas_error_disposition(errp);
u8 error_type;
if (!rtas_error_extended(errp))
goto out;
pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
if (!pseries_log)
goto out;
mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
error_type = mce_log->error_type;
disposition = mce_handle_err_realmode(disposition, error_type);
/*
* Enable translation as we will be accessing per-cpu variables
* in save_mce_event() which may fall outside RMO region, also
@@ -723,10 +743,10 @@ out:
* Note: All the realmode handling like flushing SLB entries for
* SLB multihit is done by now.
*/
out:
mtmsr(mfmsr() | MSR_IR | MSR_DR);
save_mce_event(regs, disposition == RTAS_DISP_FULLY_RECOVERED,
&mce_err, regs->nip, eaddr, paddr);
disposition = mce_handle_err_virtmode(regs, errp, mce_log,
disposition);
return disposition;
}

View File

@@ -544,6 +544,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
pushq %rsi
call set_sev_encryption_mask
call load_stage2_idt
/* Pass boot_params to initialize_identity_maps() */
movq (%rsp), %rdi
call initialize_identity_maps
popq %rsi

View File

@@ -33,11 +33,11 @@
#define __PAGE_OFFSET __PAGE_OFFSET_BASE
#include "../../mm/ident_map.c"
#ifdef CONFIG_X86_5LEVEL
unsigned int __pgtable_l5_enabled;
unsigned int pgdir_shift = 39;
unsigned int ptrs_per_p4d = 1;
#endif
#define _SETUP
#include <asm/setup.h> /* For COMMAND_LINE_SIZE */
#undef _SETUP
extern unsigned long get_cmd_line_ptr(void);
/* Used by PAGE_KERN* macros: */
pteval_t __default_kernel_pte_mask __read_mostly = ~0;
@@ -107,8 +107,10 @@ static void add_identity_map(unsigned long start, unsigned long end)
}
/* Locates and clears a region for a new top level page table. */
void initialize_identity_maps(void)
void initialize_identity_maps(void *rmode)
{
unsigned long cmdline;
/* Exclude the encryption mask from __PHYSICAL_MASK */
physical_mask &= ~sme_me_mask;
@@ -149,10 +151,19 @@ void initialize_identity_maps(void)
}
/*
* New page-table is set up - map the kernel image and load it
* into cr3.
* New page-table is set up - map the kernel image, boot_params and the
* command line. The uncompressed kernel requires boot_params and the
* command line to be mapped in the identity mapping. Map them
* explicitly here in case the compressed kernel does not touch them,
* or does not touch all the pages covering them.
*/
add_identity_map((unsigned long)_head, (unsigned long)_end);
boot_params = rmode;
add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
cmdline = get_cmd_line_ptr();
add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
/* Load the new page-table. */
write_cr3(top_level_pgt);
}

View File

@@ -840,14 +840,6 @@ void choose_random_location(unsigned long input,
return;
}
#ifdef CONFIG_X86_5LEVEL
if (__read_cr4() & X86_CR4_LA57) {
__pgtable_l5_enabled = 1;
pgdir_shift = 48;
ptrs_per_p4d = 512;
}
#endif
boot_params->hdr.loadflags |= KASLR_FLAG;
if (IS_ENABLED(CONFIG_X86_32))

View File

@@ -8,6 +8,13 @@
#define BIOS_START_MIN 0x20000U /* 128K, less than this is insane */
#define BIOS_START_MAX 0x9f000U /* 640K, absolute maximum */
#ifdef CONFIG_X86_5LEVEL
/* __pgtable_l5_enabled needs to be in .data to avoid being cleared along with .bss */
unsigned int __section(.data) __pgtable_l5_enabled;
unsigned int __section(.data) pgdir_shift = 39;
unsigned int __section(.data) ptrs_per_p4d = 1;
#endif
struct paging_config {
unsigned long trampoline_start;
unsigned long l5_required;
@@ -198,4 +205,13 @@ void cleanup_trampoline(void *pgtable)
/* Restore trampoline memory */
memcpy(trampoline_32bit, trampoline_save, TRAMPOLINE_32BIT_SIZE);
/* Initialize variables for 5-level paging */
#ifdef CONFIG_X86_5LEVEL
if (__read_cr4() & X86_CR4_LA57) {
__pgtable_l5_enabled = 1;
pgdir_shift = 48;
ptrs_per_p4d = 512;
}
#endif
}

View File

@@ -47,6 +47,8 @@ endif
# non-deterministic coverage.
KCOV_INSTRUMENT := n
CFLAGS_head$(BITS).o += -fno-stack-protector
CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
obj-y := process_$(BITS).o signal.o

View File

@@ -197,12 +197,9 @@ static void ioapic_lazy_update_eoi(struct kvm_ioapic *ioapic, int irq)
/*
* If no longer has pending EOI in LAPICs, update
* EOI for this vetor.
* EOI for this vector.
*/
rtc_irq_eoi(ioapic, vcpu, entry->fields.vector);
kvm_ioapic_update_eoi_one(vcpu, ioapic,
entry->fields.trig_mode,
irq);
break;
}
}

View File

@@ -209,7 +209,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
WARN_ON(level > PT64_ROOT_MAX_LEVEL);
WARN_ON(level < PG_LEVEL_4K);
WARN_ON(gfn % KVM_PAGES_PER_HPAGE(level));
WARN_ON(gfn & (KVM_PAGES_PER_HPAGE(level) - 1));
/*
* If this warning were to trigger it would indicate that there was a

View File

@@ -222,7 +222,7 @@ void pi_wakeup_handler(void)
spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
}
void __init pi_init(int cpu)
void __init pi_init_cpu(int cpu)
{
INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));

View File

@@ -91,7 +91,7 @@ void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu);
int pi_pre_block(struct kvm_vcpu *vcpu);
void pi_post_block(struct kvm_vcpu *vcpu);
void pi_wakeup_handler(void);
void __init pi_init(int cpu);
void __init pi_init_cpu(int cpu);
bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu);
int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
bool set);

View File

@@ -8004,7 +8004,7 @@ static int __init vmx_init(void)
for_each_possible_cpu(cpu) {
INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
pi_init(cpu);
pi_init_cpu(cpu);
}
#ifdef CONFIG_KEXEC_CORE

View File

@@ -88,14 +88,17 @@ int xen_smp_intr_init(unsigned int cpu)
per_cpu(xen_callfunc_irq, cpu).irq = rc;
per_cpu(xen_callfunc_irq, cpu).name = callfunc_name;
if (!xen_fifo_events) {
debug_name = kasprintf(GFP_KERNEL, "debug%d", cpu);
rc = bind_virq_to_irqhandler(VIRQ_DEBUG, cpu, xen_debug_interrupt,
rc = bind_virq_to_irqhandler(VIRQ_DEBUG, cpu,
xen_debug_interrupt,
IRQF_PERCPU | IRQF_NOBALANCING,
debug_name, NULL);
if (rc < 0)
goto fail;
per_cpu(xen_debug_irq, cpu).irq = rc;
per_cpu(xen_debug_irq, cpu).name = debug_name;
}
callfunc_name = kasprintf(GFP_KERNEL, "callfuncsingle%d", cpu);
rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_SINGLE_VECTOR,

View File

@@ -29,6 +29,8 @@ extern struct start_info *xen_start_info;
extern struct shared_info xen_dummy_shared_info;
extern struct shared_info *HYPERVISOR_shared_info;
extern bool xen_fifo_events;
void xen_setup_mfn_list_list(void);
void xen_build_mfn_list_list(void);
void xen_setup_machphys_mapping(void);

View File

@@ -186,6 +186,10 @@ static const struct {
/* device mapper special case, should not leak out: */
[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
/* zone device specific errors */
[BLK_STS_ZONE_OPEN_RESOURCE] = { -ETOOMANYREFS, "open zones exceeded" },
[BLK_STS_ZONE_ACTIVE_RESOURCE] = { -EOVERFLOW, "active zones exceeded" },
/* everything else not covered above: */
[BLK_STS_IOERR] = { -EIO, "I/O" },
};

View File

@@ -89,7 +89,7 @@ int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int index)
for_each_possible_cpu(i) {
if (index == qmap->mq_map[i])
return local_memory_node(cpu_to_node(i));
return cpu_to_node(i);
}
return NUMA_NO_NODE;

View File

@@ -1664,7 +1664,7 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
EXPORT_SYMBOL(blk_mq_run_hw_queue);
/**
* blk_mq_run_hw_queue - Run all hardware queues in a request queue.
* blk_mq_run_hw_queues - Run all hardware queues in a request queue.
* @q: Pointer to the request queue to run.
* @async: If we want to run the queue asynchronously.
*/
@@ -2743,7 +2743,7 @@ static void blk_mq_init_cpu_queues(struct request_queue *q,
for (j = 0; j < set->nr_maps; j++) {
hctx = blk_mq_map_queue_type(q, j, i);
if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE)
hctx->numa_node = local_memory_node(cpu_to_node(i));
hctx->numa_node = cpu_to_node(i);
}
}
}

View File

@@ -5616,7 +5616,7 @@ int ata_host_start(struct ata_host *host)
EXPORT_SYMBOL_GPL(ata_host_start);
/**
* ata_sas_host_init - Initialize a host struct for sas (ipr, libsas)
* ata_host_init - Initialize a host struct for sas (ipr, libsas)
* @host: host to initialize
* @dev: device host is attached to
* @ops: port_ops

View File

@@ -1115,7 +1115,7 @@ void ata_eh_freeze_port(struct ata_port *ap)
EXPORT_SYMBOL_GPL(ata_eh_freeze_port);
/**
* ata_port_thaw_port - EH helper to thaw port
* ata_eh_thaw_port - EH helper to thaw port
* @ap: ATA port to thaw
*
* Thaw frozen port @ap.

View File

@@ -1003,7 +1003,7 @@ void ata_scsi_sdev_config(struct scsi_device *sdev)
}
/**
* atapi_drain_needed - Check whether data transfer may overflow
* ata_scsi_dma_need_drain - Check whether data transfer may overflow
* @rq: request to be checked
*
* ATAPI commands which transfer variable length data to host

View File

@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* pata_ns87415.c - NS87415 (non PARISC) PATA
* pata_ns87415.c - NS87415 (and PARISC SUPERIO 87560) PATA
*
* (C) 2005 Red Hat <alan@lxorguk.ukuu.org.uk>
*
@@ -16,7 +16,6 @@
* systems. This has its own special mountain of errata.
*
* TODO:
* Test PARISC SuperIO
* Get someone to test on SPARC
* Implement lazy pio/dma switching for better performance
* 8bit shared timing.

View File

@@ -120,7 +120,7 @@
/* Descriptor table word 0 bit (when DTA32M = 1) */
#define SATA_RCAR_DTEND BIT(0)
#define SATA_RCAR_DMA_BOUNDARY 0x1FFFFFFEUL
#define SATA_RCAR_DMA_BOUNDARY 0x1FFFFFFFUL
/* Gen2 Physical Layer Control Registers */
#define RCAR_GEN2_PHY_CTL1_REG 0x1704

View File

@@ -802,9 +802,9 @@ static void recv_work(struct work_struct *work)
if (likely(!blk_should_fake_timeout(rq->q)))
blk_mq_complete_request(rq);
}
nbd_config_put(nbd);
atomic_dec(&config->recv_threads);
wake_up(&config->recv_wq);
nbd_config_put(nbd);
kfree(args);
}

View File

@@ -220,29 +220,34 @@ static void null_close_first_imp_zone(struct nullb_device *dev)
}
}
static bool null_can_set_active(struct nullb_device *dev)
static blk_status_t null_check_active(struct nullb_device *dev)
{
if (!dev->zone_max_active)
return true;
return BLK_STS_OK;
return dev->nr_zones_exp_open + dev->nr_zones_imp_open +
dev->nr_zones_closed < dev->zone_max_active;
if (dev->nr_zones_exp_open + dev->nr_zones_imp_open +
dev->nr_zones_closed < dev->zone_max_active)
return BLK_STS_OK;
return BLK_STS_ZONE_ACTIVE_RESOURCE;
}
static bool null_can_open(struct nullb_device *dev)
static blk_status_t null_check_open(struct nullb_device *dev)
{
if (!dev->zone_max_open)
return true;
return BLK_STS_OK;
if (dev->nr_zones_exp_open + dev->nr_zones_imp_open < dev->zone_max_open)
return true;
return BLK_STS_OK;
if (dev->nr_zones_imp_open && null_can_set_active(dev)) {
if (dev->nr_zones_imp_open) {
if (null_check_active(dev) == BLK_STS_OK) {
null_close_first_imp_zone(dev);
return true;
return BLK_STS_OK;
}
}
return false;
return BLK_STS_ZONE_OPEN_RESOURCE;
}
/*
@@ -258,19 +263,22 @@ static bool null_can_open(struct nullb_device *dev)
* it is not certain that closing an implicit open zone will allow a new zone
* to be opened, since we might already be at the active limit capacity.
*/
static bool null_has_zone_resources(struct nullb_device *dev, struct blk_zone *zone)
static blk_status_t null_check_zone_resources(struct nullb_device *dev, struct blk_zone *zone)
{
blk_status_t ret;
switch (zone->cond) {
case BLK_ZONE_COND_EMPTY:
if (!null_can_set_active(dev))
return false;
ret = null_check_active(dev);
if (ret != BLK_STS_OK)
return ret;
fallthrough;
case BLK_ZONE_COND_CLOSED:
return null_can_open(dev);
return null_check_open(dev);
default:
/* Should never be called for other states */
WARN_ON(1);
return false;
return BLK_STS_IOERR;
}
}
@@ -293,8 +301,9 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
return BLK_STS_IOERR;
case BLK_ZONE_COND_EMPTY:
case BLK_ZONE_COND_CLOSED:
if (!null_has_zone_resources(dev, zone))
return BLK_STS_IOERR;
ret = null_check_zone_resources(dev, zone);
if (ret != BLK_STS_OK)
return ret;
break;
case BLK_ZONE_COND_IMP_OPEN:
case BLK_ZONE_COND_EXP_OPEN:
@@ -349,6 +358,8 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
static blk_status_t null_open_zone(struct nullb_device *dev, struct blk_zone *zone)
{
blk_status_t ret;
if (zone->type == BLK_ZONE_TYPE_CONVENTIONAL)
return BLK_STS_IOERR;
@@ -357,15 +368,17 @@ static blk_status_t null_open_zone(struct nullb_device *dev, struct blk_zone *zo
/* open operation on exp open is not an error */
return BLK_STS_OK;
case BLK_ZONE_COND_EMPTY:
if (!null_has_zone_resources(dev, zone))
return BLK_STS_IOERR;
ret = null_check_zone_resources(dev, zone);
if (ret != BLK_STS_OK)
return ret;
break;
case BLK_ZONE_COND_IMP_OPEN:
dev->nr_zones_imp_open--;
break;
case BLK_ZONE_COND_CLOSED:
if (!null_has_zone_resources(dev, zone))
return BLK_STS_IOERR;
ret = null_check_zone_resources(dev, zone);
if (ret != BLK_STS_OK)
return ret;
dev->nr_zones_closed--;
break;
case BLK_ZONE_COND_FULL:
@@ -381,6 +394,8 @@ static blk_status_t null_open_zone(struct nullb_device *dev, struct blk_zone *zo
static blk_status_t null_finish_zone(struct nullb_device *dev, struct blk_zone *zone)
{
blk_status_t ret;
if (zone->type == BLK_ZONE_TYPE_CONVENTIONAL)
return BLK_STS_IOERR;
@@ -389,8 +404,9 @@ static blk_status_t null_finish_zone(struct nullb_device *dev, struct blk_zone *
/* finish operation on full is not an error */
return BLK_STS_OK;
case BLK_ZONE_COND_EMPTY:
if (!null_has_zone_resources(dev, zone))
return BLK_STS_IOERR;
ret = null_check_zone_resources(dev, zone);
if (ret != BLK_STS_OK)
return ret;
break;
case BLK_ZONE_COND_IMP_OPEN:
dev->nr_zones_imp_open--;
@@ -399,8 +415,9 @@ static blk_status_t null_finish_zone(struct nullb_device *dev, struct blk_zone *
dev->nr_zones_exp_open--;
break;
case BLK_ZONE_COND_CLOSED:
if (!null_has_zone_resources(dev, zone))
return BLK_STS_IOERR;
ret = null_check_zone_resources(dev, zone);
if (ret != BLK_STS_OK)
return ret;
dev->nr_zones_closed--;
break;
default:

View File

@@ -91,11 +91,6 @@ static int rnbd_clt_set_dev_attr(struct rnbd_clt_dev *dev,
dev->max_hw_sectors = sess->max_io_size / SECTOR_SIZE;
dev->max_segments = BMAX_SEGMENTS;
dev->max_hw_sectors = min_t(u32, dev->max_hw_sectors,
le32_to_cpu(rsp->max_hw_sectors));
dev->max_segments = min_t(u16, dev->max_segments,
le16_to_cpu(rsp->max_segments));
return 0;
}
@@ -427,7 +422,7 @@ enum wait_type {
};
static int send_usr_msg(struct rtrs_clt *rtrs, int dir,
struct rnbd_iu *iu, struct kvec *vec, size_t nr,
struct rnbd_iu *iu, struct kvec *vec,
size_t len, struct scatterlist *sg, unsigned int sg_len,
void (*conf)(struct work_struct *work),
int *errno, enum wait_type wait)
@@ -441,7 +436,7 @@ static int send_usr_msg(struct rtrs_clt *rtrs, int dir,
.conf_fn = msg_conf,
};
err = rtrs_clt_request(dir, &req_ops, rtrs, iu->permit,
vec, nr, len, sg, sg_len);
vec, 1, len, sg, sg_len);
if (!err && wait) {
wait_event(iu->comp.wait, iu->comp.errno != INT_MAX);
*errno = iu->comp.errno;
@@ -486,7 +481,7 @@ static int send_msg_close(struct rnbd_clt_dev *dev, u32 device_id, bool wait)
msg.device_id = cpu_to_le32(device_id);
WARN_ON(!rnbd_clt_get_dev(dev));
err = send_usr_msg(sess->rtrs, WRITE, iu, &vec, 1, 0, NULL, 0,
err = send_usr_msg(sess->rtrs, WRITE, iu, &vec, 0, NULL, 0,
msg_close_conf, &errno, wait);
if (err) {
rnbd_clt_put_dev(dev);
@@ -575,7 +570,7 @@ static int send_msg_open(struct rnbd_clt_dev *dev, bool wait)
WARN_ON(!rnbd_clt_get_dev(dev));
err = send_usr_msg(sess->rtrs, READ, iu,
&vec, 1, sizeof(*rsp), iu->sglist, 1,
&vec, sizeof(*rsp), iu->sglist, 1,
msg_open_conf, &errno, wait);
if (err) {
rnbd_clt_put_dev(dev);
@@ -629,7 +624,7 @@ static int send_msg_sess_info(struct rnbd_clt_session *sess, bool wait)
goto put_iu;
}
err = send_usr_msg(sess->rtrs, READ, iu,
&vec, 1, sizeof(*rsp), iu->sglist, 1,
&vec, sizeof(*rsp), iu->sglist, 1,
msg_sess_info_conf, &errno, wait);
if (err) {
rnbd_clt_put_sess(sess);
@@ -1514,7 +1509,7 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
"map_device: Failed to configure device, err: %d\n",
ret);
mutex_unlock(&dev->lock);
goto del_dev;
goto send_close;
}
rnbd_clt_info(dev,
@@ -1533,6 +1528,8 @@ struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
return dev;
send_close:
send_msg_close(dev, dev->device_id, WAIT);
del_dev:
delete_dev(dev);
put_dev:

View File

@@ -25,7 +25,6 @@
#include <linux/dma-mapping.h>
#include <linux/completion.h>
#include <linux/scatterlist.h>
#include <linux/version.h>
#include <linux/err.h>
#include <linux/aer.h>
#include <linux/wait.h>

View File

@@ -473,6 +473,12 @@ static void xen_vbd_free(struct xen_vbd *vbd)
vbd->bdev = NULL;
}
/* Enable the persistent grants feature. */
static bool feature_persistent = true;
module_param(feature_persistent, bool, 0644);
MODULE_PARM_DESC(feature_persistent,
"Enables the persistent grants feature");
static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
unsigned major, unsigned minor, int readonly,
int cdrom)
@@ -518,6 +524,8 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
if (q && blk_queue_secure_erase(q))
vbd->discard_secure = true;
vbd->feature_gnt_persistent = feature_persistent;
pr_debug("Successful creation of handle=%04x (dom=%u)\n",
handle, blkif->domid);
return 0;
@@ -905,7 +913,8 @@ again:
xen_blkbk_barrier(xbt, be, be->blkif->vbd.flush_support);
err = xenbus_printf(xbt, dev->nodename, "feature-persistent", "%u", 1);
err = xenbus_printf(xbt, dev->nodename, "feature-persistent", "%u",
be->blkif->vbd.feature_gnt_persistent);
if (err) {
xenbus_dev_fatal(dev, err, "writing %s/feature-persistent",
dev->nodename);
@@ -1066,7 +1075,6 @@ static int connect_ring(struct backend_info *be)
{
struct xenbus_device *dev = be->dev;
struct xen_blkif *blkif = be->blkif;
unsigned int pers_grants;
char protocol[64] = "";
int err, i;
char *xspath;
@@ -1092,9 +1100,11 @@ static int connect_ring(struct backend_info *be)
xenbus_dev_fatal(dev, err, "unknown fe protocol %s", protocol);
return -ENOSYS;
}
pers_grants = xenbus_read_unsigned(dev->otherend, "feature-persistent",
0);
blkif->vbd.feature_gnt_persistent = pers_grants;
if (blkif->vbd.feature_gnt_persistent)
blkif->vbd.feature_gnt_persistent =
xenbus_read_unsigned(dev->otherend,
"feature-persistent", 0);
blkif->vbd.overflow_max_grants = 0;
/*
@@ -1117,7 +1127,7 @@ static int connect_ring(struct backend_info *be)
pr_info("%s: using %d queues, protocol %d (%s) %s\n", dev->nodename,
blkif->nr_rings, blkif->blk_protocol, protocol,
pers_grants ? "persistent grants" : "");
blkif->vbd.feature_gnt_persistent ? "persistent grants" : "");
ring_page_order = xenbus_read_unsigned(dev->otherend,
"ring-page-order", 0);

View File

@@ -1866,8 +1866,8 @@ again:
message = "writing protocol";
goto abort_transaction;
}
err = xenbus_printf(xbt, dev->nodename,
"feature-persistent", "%u", 1);
err = xenbus_printf(xbt, dev->nodename, "feature-persistent", "%u",
info->feature_persistent);
if (err)
dev_warn(&dev->dev,
"writing persistent grants feature to xenbus");
@@ -1941,6 +1941,13 @@ static int negotiate_mq(struct blkfront_info *info)
}
return 0;
}
/* Enable the persistent grants feature. */
static bool feature_persistent = true;
module_param(feature_persistent, bool, 0644);
MODULE_PARM_DESC(feature_persistent,
"Enables the persistent grants feature");
/**
* Entry point to this code when a new device is created. Allocate the basic
* structures and the ring buffer for communication with the backend, and
@@ -2007,6 +2014,8 @@ static int blkfront_probe(struct xenbus_device *dev,
info->vdevice = vdevice;
info->connected = BLKIF_STATE_DISCONNECTED;
info->feature_persistent = feature_persistent;
/* Front end dir is a number, which is used as the id. */
info->handle = simple_strtoul(strrchr(dev->nodename, '/')+1, NULL, 0);
dev_set_drvdata(&dev->dev, info);
@@ -2316,6 +2325,7 @@ static void blkfront_gather_backend_features(struct blkfront_info *info)
if (xenbus_read_unsigned(info->xbdev->otherend, "feature-discard", 0))
blkfront_setup_discard(info);
if (info->feature_persistent)
info->feature_persistent =
!!xenbus_read_unsigned(info->xbdev->otherend,
"feature-persistent", 0);

View File

@@ -1218,10 +1218,11 @@ out:
static int __zram_bvec_read(struct zram *zram, struct page *page, u32 index,
struct bio *bio, bool partial_io)
{
int ret;
struct zcomp_strm *zstrm;
unsigned long handle;
unsigned int size;
void *src, *dst;
int ret;
zram_slot_lock(zram, index);
if (zram_test_flag(zram, index, ZRAM_WB)) {
@@ -1252,6 +1253,9 @@ static int __zram_bvec_read(struct zram *zram, struct page *page, u32 index,
size = zram_get_obj_size(zram, index);
if (size != PAGE_SIZE)
zstrm = zcomp_stream_get(zram->comp);
src = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
if (size == PAGE_SIZE) {
dst = kmap_atomic(page);
@@ -1259,8 +1263,6 @@ static int __zram_bvec_read(struct zram *zram, struct page *page, u32 index,
kunmap_atomic(dst);
ret = 0;
} else {
struct zcomp_strm *zstrm = zcomp_stream_get(zram->comp);
dst = kmap_atomic(page);
ret = zcomp_decompress(zstrm, src, size, dst);
kunmap_atomic(dst);

View File

@@ -1277,7 +1277,6 @@ void add_interrupt_randomness(int irq, int irq_flags)
fast_mix(fast_pool);
add_interrupt_bench(cycles);
this_cpu_add(net_rand_state.s1, fast_pool->pool[cycles & 3]);
if (unlikely(crng_init == 0)) {
if ((fast_pool->count >= 64) &&

View File

@@ -264,6 +264,7 @@ static acpi_status i2c_acpi_add_device(acpi_handle handle, u32 level,
void i2c_acpi_register_devices(struct i2c_adapter *adap)
{
acpi_status status;
acpi_handle handle;
if (!has_acpi_companion(&adap->dev))
return;
@@ -274,6 +275,15 @@ void i2c_acpi_register_devices(struct i2c_adapter *adap)
adap, NULL);
if (ACPI_FAILURE(status))
dev_warn(&adap->dev, "failed to enumerate I2C slaves\n");
if (!adap->dev.parent)
return;
handle = ACPI_HANDLE(adap->dev.parent);
if (!handle)
return;
acpi_walk_dep_device_list(handle);
}
static const struct acpi_device_id i2c_acpi_force_400khz_device_ids[] = {
@@ -719,7 +729,6 @@ int i2c_acpi_install_space_handler(struct i2c_adapter *adapter)
return -ENOMEM;
}
acpi_walk_dep_device_list(handle);
return 0;
}

View File

@@ -74,7 +74,7 @@ EXPORT_SYMBOL(hil_mlc_unregister);
static LIST_HEAD(hil_mlcs);
static DEFINE_RWLOCK(hil_mlcs_lock);
static struct timer_list hil_mlcs_kicker;
static int hil_mlcs_probe;
static int hil_mlcs_probe, hil_mlc_stop;
static void hil_mlcs_process(unsigned long unused);
static DECLARE_TASKLET_DISABLED_OLD(hil_mlcs_tasklet, hil_mlcs_process);
@@ -702,9 +702,13 @@ static int hilse_donode(hil_mlc *mlc)
if (!mlc->ostarted) {
mlc->ostarted = 1;
mlc->opacket = pack;
mlc->out(mlc);
rc = mlc->out(mlc);
nextidx = HILSEN_DOZE;
write_unlock_irqrestore(&mlc->lock, flags);
if (rc) {
hil_mlc_stop = 1;
return 1;
}
break;
}
mlc->ostarted = 0;
@@ -715,8 +719,13 @@ static int hilse_donode(hil_mlc *mlc)
case HILSE_CTS:
write_lock_irqsave(&mlc->lock, flags);
nextidx = mlc->cts(mlc) ? node->bad : node->good;
rc = mlc->cts(mlc);
nextidx = rc ? node->bad : node->good;
write_unlock_irqrestore(&mlc->lock, flags);
if (rc) {
hil_mlc_stop = 1;
return 1;
}
break;
default:
@@ -780,6 +789,12 @@ static void hil_mlcs_process(unsigned long unused)
static void hil_mlcs_timer(struct timer_list *unused)
{
if (hil_mlc_stop) {
/* could not send packet - stop immediately. */
pr_warn(PREFIX "HIL seems stuck - Disabling HIL MLC.\n");
return;
}
hil_mlcs_probe = 1;
tasklet_schedule(&hil_mlcs_tasklet);
/* Re-insert the periodic task. */

View File

@@ -210,7 +210,7 @@ static int hp_sdc_mlc_cts(hil_mlc *mlc)
priv->tseq[2] = 1;
priv->tseq[3] = 0;
priv->tseq[4] = 0;
__hp_sdc_enqueue_transaction(&priv->trans);
return __hp_sdc_enqueue_transaction(&priv->trans);
busy:
return 1;
done:
@@ -219,7 +219,7 @@ static int hp_sdc_mlc_cts(hil_mlc *mlc)
return 0;
}
static void hp_sdc_mlc_out(hil_mlc *mlc)
static int hp_sdc_mlc_out(hil_mlc *mlc)
{
struct hp_sdc_mlc_priv_s *priv;
@@ -234,7 +234,7 @@ static void hp_sdc_mlc_out(hil_mlc *mlc)
do_data:
if (priv->emtestmode) {
up(&mlc->osem);
return;
return 0;
}
/* Shouldn't be sending commands when loop may be busy */
BUG_ON(down_trylock(&mlc->csem));
@@ -296,7 +296,7 @@ static void hp_sdc_mlc_out(hil_mlc *mlc)
BUG_ON(down_trylock(&mlc->csem));
}
enqueue:
hp_sdc_enqueue_transaction(&priv->trans);
return hp_sdc_enqueue_transaction(&priv->trans);
}
static int __init hp_sdc_mlc_init(void)

View File

@@ -1311,8 +1311,9 @@ static long nvm_ioctl_get_devices(struct file *file, void __user *arg)
strlcpy(info->bmname, "gennvm", sizeof(info->bmname));
i++;
if (i > 31) {
pr_err("max 31 devices can be reported.\n");
if (i >= ARRAY_SIZE(devices->info)) {
pr_err("max %zd devices can be reported.\n",
ARRAY_SIZE(devices->info));
break;
}
}

View File

@@ -248,6 +248,10 @@ static blk_status_t nvme_error_status(u16 status)
return BLK_STS_NEXUS;
case NVME_SC_HOST_PATH_ERROR:
return BLK_STS_TRANSPORT;
case NVME_SC_ZONE_TOO_MANY_ACTIVE:
return BLK_STS_ZONE_ACTIVE_RESOURCE;
case NVME_SC_ZONE_TOO_MANY_OPEN:
return BLK_STS_ZONE_OPEN_RESOURCE;
default:
return BLK_STS_IOERR;
}

View File

@@ -26,6 +26,10 @@ enum nvme_fc_queue_flags {
};
#define NVME_FC_DEFAULT_DEV_LOSS_TMO 60 /* seconds */
#define NVME_FC_DEFAULT_RECONNECT_TMO 2 /* delay between reconnects
* when connected and a
* connection failure.
*/
struct nvme_fc_queue {
struct nvme_fc_ctrl *ctrl;
@@ -1837,8 +1841,10 @@ __nvme_fc_abort_op(struct nvme_fc_ctrl *ctrl, struct nvme_fc_fcp_op *op)
opstate = atomic_xchg(&op->state, FCPOP_STATE_ABORTED);
if (opstate != FCPOP_STATE_ACTIVE)
atomic_set(&op->state, opstate);
else if (test_bit(FCCTRL_TERMIO, &ctrl->flags))
else if (test_bit(FCCTRL_TERMIO, &ctrl->flags)) {
op->flags |= FCOP_FLAGS_TERMIO;
ctrl->iocnt++;
}
spin_unlock_irqrestore(&ctrl->lock, flags);
if (opstate != FCPOP_STATE_ACTIVE)
@@ -1874,7 +1880,8 @@ __nvme_fc_fcpop_chk_teardowns(struct nvme_fc_ctrl *ctrl,
if (opstate == FCPOP_STATE_ABORTED) {
spin_lock_irqsave(&ctrl->lock, flags);
if (test_bit(FCCTRL_TERMIO, &ctrl->flags)) {
if (test_bit(FCCTRL_TERMIO, &ctrl->flags) &&
op->flags & FCOP_FLAGS_TERMIO) {
if (!--ctrl->iocnt)
wake_up(&ctrl->ioabort_wait);
}
@@ -2314,7 +2321,7 @@ nvme_fc_create_hw_io_queues(struct nvme_fc_ctrl *ctrl, u16 qsize)
return 0;
delete_queues:
for (; i >= 0; i--)
for (; i > 0; i--)
__nvme_fc_delete_hw_queue(ctrl, &ctrl->queues[i], i);
return ret;
}
@@ -2433,7 +2440,7 @@ nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg)
return;
dev_warn(ctrl->ctrl.device,
"NVME-FC{%d}: transport association error detected: %s\n",
"NVME-FC{%d}: transport association event: %s\n",
ctrl->cnum, errmsg);
dev_warn(ctrl->ctrl.device,
"NVME-FC{%d}: resetting controller\n", ctrl->cnum);
@@ -2446,15 +2453,20 @@ nvme_fc_timeout(struct request *rq, bool reserved)
{
struct nvme_fc_fcp_op *op = blk_mq_rq_to_pdu(rq);
struct nvme_fc_ctrl *ctrl = op->ctrl;
struct nvme_fc_cmd_iu *cmdiu = &op->cmd_iu;
struct nvme_command *sqe = &cmdiu->sqe;
/*
* we can't individually ABTS an io without affecting the queue,
* thus killing the queue, and thus the association.
* So resolve by performing a controller reset, which will stop
* the host/io stack, terminate the association on the link,
* and recreate an association on the link.
* Attempt to abort the offending command. Command completion
* will detect the aborted io and will fail the connection.
*/
nvme_fc_error_recovery(ctrl, "io timeout error");
dev_info(ctrl->ctrl.device,
"NVME-FC{%d.%d}: io timeout: opcode %d fctype %d w10/11: "
"x%08x/x%08x\n",
ctrl->cnum, op->queue->qnum, sqe->common.opcode,
sqe->connect.fctype, sqe->common.cdw10, sqe->common.cdw11);
if (__nvme_fc_abort_op(ctrl, op))
nvme_fc_error_recovery(ctrl, "io timeout abort failed");
/*
* the io abort has been initiated. Have the reset timer
@@ -2726,6 +2738,7 @@ nvme_fc_complete_rq(struct request *rq)
struct nvme_fc_ctrl *ctrl = op->ctrl;
atomic_set(&op->state, FCPOP_STATE_IDLE);
op->flags &= ~FCOP_FLAGS_TERMIO;
nvme_fc_unmap_data(ctrl, rq, op);
nvme_complete_rq(rq);
@@ -2876,11 +2889,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
if (ret)
goto out_delete_hw_queues;
if (prior_ioq_cnt != nr_io_queues)
if (prior_ioq_cnt != nr_io_queues) {
dev_info(ctrl->ctrl.device,
"reconnect: revising io queue count from %d to %d\n",
prior_ioq_cnt, nr_io_queues);
nvme_wait_freeze(&ctrl->ctrl);
blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
nvme_unfreeze(&ctrl->ctrl);
}
return 0;
@@ -3090,6 +3106,61 @@ out_free_queue:
return ret;
}
/*
* This routine runs through all outstanding commands on the association
* and aborts them. This routine is typically called by the
* delete_association routine. It is also called due to an error during
* reconnect. In that scenario, it is most likely a command that initializes
* the controller, including fabric Connect commands on io queues, that
* may have timed out or failed thus the io must be killed for the connect
* thread to see the error.
*/
static void
__nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
{
/*
* If io queues are present, stop them and terminate all outstanding
* ios on them. As FC allocates FC exchange for each io, the
* transport must contact the LLDD to terminate the exchange,
* thus releasing the FC exchange. We use blk_mq_tagset_busy_iter()
* to tell us what io's are busy and invoke a transport routine
* to kill them with the LLDD. After terminating the exchange
* the LLDD will call the transport's normal io done path, but it
* will have an aborted status. The done path will return the
* io requests back to the block layer as part of normal completions
* (but with error status).
*/
if (ctrl->ctrl.queue_count > 1) {
nvme_stop_queues(&ctrl->ctrl);
blk_mq_tagset_busy_iter(&ctrl->tag_set,
nvme_fc_terminate_exchange, &ctrl->ctrl);
blk_mq_tagset_wait_completed_request(&ctrl->tag_set);
if (start_queues)
nvme_start_queues(&ctrl->ctrl);
}
/*
* Other transports, which don't have link-level contexts bound
* to sqe's, would try to gracefully shutdown the controller by
* writing the registers for shutdown and polling (call
* nvme_shutdown_ctrl()). Given a bunch of i/o was potentially
* just aborted and we will wait on those contexts, and given
* there was no indication of how live the controller is on the
* link, don't send more io to create more contexts for the
* shutdown. Let the controller fail via keepalive failure if
* it's still present.
*/
/*
* clean up the admin queue. Same thing as above.
*/
blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
nvme_fc_terminate_exchange, &ctrl->ctrl);
blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);
}
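For orientation, the two call sites this helper gains later in this diff differ only in start_queues, as a short sketch of both (taken from the hunks below, condensed here for readability):

/* teardown (nvme_fc_delete_association): leave queues quiesced until the
 * association is rebuilt */
__nvme_fc_abort_outstanding_ios(ctrl, false);

/* reconnect failure (__nvme_fc_terminate_io, state CONNECTING): restart the
 * queues so the aborted connect io can complete and surface the error */
__nvme_fc_abort_outstanding_ios(ctrl, true);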
/*
* This routine stops operation of the controller on the host side.
* On the host os stack side: Admin and IO queues are stopped,
@@ -3110,46 +3181,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
ctrl->iocnt = 0;
spin_unlock_irqrestore(&ctrl->lock, flags);
/*
* If io queues are present, stop them and terminate all outstanding
* ios on them. As FC allocates FC exchange for each io, the
* transport must contact the LLDD to terminate the exchange,
* thus releasing the FC exchange. We use blk_mq_tagset_busy_iter()
* to tell us what io's are busy and invoke a transport routine
* to kill them with the LLDD. After terminating the exchange
* the LLDD will call the transport's normal io done path, but it
* will have an aborted status. The done path will return the
* io requests back to the block layer as part of normal completions
* (but with error status).
*/
if (ctrl->ctrl.queue_count > 1) {
nvme_stop_queues(&ctrl->ctrl);
blk_mq_tagset_busy_iter(&ctrl->tag_set,
nvme_fc_terminate_exchange, &ctrl->ctrl);
blk_mq_tagset_wait_completed_request(&ctrl->tag_set);
}
/*
* Other transports, which don't have link-level contexts bound
* to sqe's, would try to gracefully shut down the controller by
* writing the registers for shutdown and polling (call
* nvme_shutdown_ctrl()). Given a bunch of i/o was potentially
* just aborted and we will wait on those contexts, and given
* there was no indication of how live the controller is on the
* link, don't send more io to create more contexts for the
* shutdown. Let the controller fail via keepalive failure if
* it's still present.
*/
/*
* clean up the admin queue. Same thing as above.
* use blk_mq_tagset_busy_iter() and the transport routine to
* terminate the exchanges.
*/
blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
nvme_fc_terminate_exchange, &ctrl->ctrl);
blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);
__nvme_fc_abort_outstanding_ios(ctrl, false);
/* kill the aens as they are a separate path */
nvme_fc_abort_aen_ops(ctrl);
@@ -3263,21 +3295,26 @@ static void
__nvme_fc_terminate_io(struct nvme_fc_ctrl *ctrl)
{
/*
* if state is connecting - the error occurred as part of a
* reconnect attempt. The create_association error paths will
* clean up any outstanding io.
*
* if it's a different state - ensure all pending io is
* terminated. Given this can delay while waiting for the
* aborted io to return, we recheck adapter state below
* before changing state.
* if state is CONNECTING - the error occurred as part of a
* reconnect attempt. Abort any ios on the association and
* let the create_association error paths resolve things.
*/
if (ctrl->ctrl.state != NVME_CTRL_CONNECTING) {
if (ctrl->ctrl.state == NVME_CTRL_CONNECTING) {
__nvme_fc_abort_outstanding_ios(ctrl, true);
return;
}
/*
* For any other state, kill the association. As this routine
* is a common io abort routine for resetting and such, after
* the association is terminated, ensure that the state is set
* to CONNECTING.
*/
nvme_stop_keep_alive(&ctrl->ctrl);
/* will block while waiting for io to terminate */
nvme_fc_delete_association(ctrl);
}
if (ctrl->ctrl.state != NVME_CTRL_CONNECTING &&
!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))
@@ -3403,7 +3440,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
{
struct nvme_fc_ctrl *ctrl;
unsigned long flags;
int ret, idx;
int ret, idx, ctrl_loss_tmo;
if (!(rport->remoteport.port_role &
(FC_PORT_ROLE_NVME_DISCOVERY | FC_PORT_ROLE_NVME_TARGET))) {
@@ -3429,6 +3466,19 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
goto out_free_ctrl;
}
/*
* if ctrl_loss_tmo is being enforced and the default reconnect delay
* is being used, change to a shorter reconnect delay for FC.
*/
if (opts->max_reconnects != -1 &&
opts->reconnect_delay == NVMF_DEF_RECONNECT_DELAY &&
opts->reconnect_delay > NVME_FC_DEFAULT_RECONNECT_TMO) {
ctrl_loss_tmo = opts->max_reconnects * opts->reconnect_delay;
opts->reconnect_delay = NVME_FC_DEFAULT_RECONNECT_TMO;
opts->max_reconnects = DIV_ROUND_UP(ctrl_loss_tmo,
opts->reconnect_delay);
}
ctrl->ctrl.opts = opts;
ctrl->ctrl.nr_reconnects = 0;
if (lport->dev)
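To make the reconnect rescaling above concrete, a hedged worked example follows; the constant values (NVMF_DEF_RECONNECT_DELAY = 10, NVME_FC_DEFAULT_RECONNECT_TMO = 2, and a 600-second ctrl_loss_tmo) are assumptions based on the usual fabrics defaults, not values taken from this diff:

#include <stdio.h>

/* assumed values of the kernel constants referenced above */
#define NVMF_DEF_RECONNECT_DELAY	10	/* seconds */
#define NVME_FC_DEFAULT_RECONNECT_TMO	2	/* seconds, added by this patch */
#define DIV_ROUND_UP(n, d)		(((n) + (d) - 1) / (d))

int main(void)
{
	/* hypothetical controller created with ctrl_loss_tmo = 600s */
	int reconnect_delay = NVMF_DEF_RECONNECT_DELAY;
	int max_reconnects = 600 / reconnect_delay;		/* 60 attempts */

	/* the rescaling performed in nvme_fc_init_ctrl() above */
	int ctrl_loss_tmo = max_reconnects * reconnect_delay;	/* still 600s */
	reconnect_delay = NVME_FC_DEFAULT_RECONNECT_TMO;
	max_reconnects = DIV_ROUND_UP(ctrl_loss_tmo, reconnect_delay);

	/* prints 300: same overall loss window, retried every 2s, not 10s */
	printf("%d attempts every %ds\n", max_reconnects, reconnect_delay);
	return 0;
}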

View File

@@ -176,7 +176,7 @@ static inline struct nvme_request *nvme_req(struct request *req)
static inline u16 nvme_req_qid(struct request *req)
{
if (!req->rq_disk)
if (!req->q->queuedata)
return 0;
return blk_mq_unique_tag_to_hwq(blk_mq_unique_tag(req)) + 1;
}

View File

@@ -3185,6 +3185,8 @@ static const struct pci_device_id nvme_id_table[] = {
NVME_QUIRK_IGNORE_DEV_SUBNQN, },
{ PCI_DEVICE(0x1c5c, 0x1504), /* SK Hynix PC400 */
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
.driver_data = NVME_QUIRK_SINGLE_VECTOR },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },

View File

@@ -1730,10 +1730,11 @@ static void nvme_rdma_process_nvme_rsp(struct nvme_rdma_queue *queue,
req->result = cqe->result;
if (wc->wc_flags & IB_WC_WITH_INVALIDATE) {
if (unlikely(wc->ex.invalidate_rkey != req->mr->rkey)) {
if (unlikely(!req->mr ||
wc->ex.invalidate_rkey != req->mr->rkey)) {
dev_err(queue->ctrl->ctrl.device,
"Bogus remote invalidation for rkey %#x\n",
req->mr->rkey);
req->mr ? req->mr->rkey : 0);
nvme_rdma_error_recovery(queue->ctrl);
}
} else if (req->mr) {
@@ -1926,7 +1927,6 @@ static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
complete(&queue->cm_done);
return 0;
case RDMA_CM_EVENT_REJECTED:
nvme_rdma_destroy_queue_ib(queue);
cm_error = nvme_rdma_conn_rejected(queue, ev);
break;
case RDMA_CM_EVENT_ROUTE_ERROR:

View File

@@ -1126,6 +1126,7 @@ static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl)
* in case a host died before it enabled the controller. Hence, simply
* reset the keep alive timer when the controller is enabled.
*/
if (ctrl->kato)
mod_delayed_work(system_wq, &ctrl->ka_work, ctrl->kato * HZ);
}

View File

@@ -26,7 +26,7 @@ static u16 nvmet_passthru_override_id_ctrl(struct nvmet_req *req)
struct nvme_ctrl *pctrl = ctrl->subsys->passthru_ctrl;
u16 status = NVME_SC_SUCCESS;
struct nvme_id_ctrl *id;
u32 max_hw_sectors;
int max_hw_sectors;
int page_shift;
id = kzalloc(sizeof(*id), GFP_KERNEL);
@@ -48,6 +48,13 @@ static u16 nvmet_passthru_override_id_ctrl(struct nvmet_req *req)
max_hw_sectors = min_not_zero(pctrl->max_segments << (PAGE_SHIFT - 9),
pctrl->max_hw_sectors);
/*
* nvmet_passthru_map_sg is limited to using a single bio so limit
* the mdts based on BIO_MAX_PAGES as well
*/
max_hw_sectors = min_not_zero(BIO_MAX_PAGES << (PAGE_SHIFT - 9),
max_hw_sectors);
page_shift = NVME_CAP_MPSMIN(ctrl->cap) + 12;
id->mdts = ilog2(max_hw_sectors) + 9 - page_shift;
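A hedged worked example of the mdts clamp above, assuming 4 KiB pages, BIO_MAX_PAGES == 256, and MPSMIN == 0 (none of these values come from the diff itself):

/* worked example (assumed: PAGE_SHIFT 12, BIO_MAX_PAGES 256, MPSMIN 0) */
unsigned int max_hw_sectors = 256 << (12 - 9);	/* 2048 sectors = 1 MiB per bio */
int page_shift = 0 + 12;			/* controller min page = 4 KiB */
int mdts = ilog2(max_hw_sectors) + 9 - page_shift;	/* 11 + 9 - 12 = 8 */
/* MDTS is a power of two in units of the min page: 2^8 * 4 KiB = 1 MiB */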
@@ -180,18 +187,20 @@ static void nvmet_passthru_req_done(struct request *rq,
static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
{
int sg_cnt = req->sg_cnt;
struct scatterlist *sg;
int op_flags = 0;
struct bio *bio;
int i, ret;
if (req->sg_cnt > BIO_MAX_PAGES)
return -EINVAL;
if (req->cmd->common.opcode == nvme_cmd_flush)
op_flags = REQ_FUA;
else if (nvme_is_write(req->cmd))
op_flags = REQ_SYNC | REQ_IDLE;
bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
bio = bio_alloc(GFP_KERNEL, req->sg_cnt);
bio->bi_end_io = bio_put;
bio->bi_opf = req_op(rq) | op_flags;
@@ -201,7 +210,6 @@ static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
bio_put(bio);
return -EINVAL;
}
sg_cnt--;
}
ret = blk_rq_append_bio(rq, &bio);
@@ -236,7 +244,7 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
q = ns->queue;
}
rq = nvme_alloc_request(q, req->cmd, BLK_MQ_REQ_NOWAIT, NVME_QID_ANY);
rq = nvme_alloc_request(q, req->cmd, 0, NVME_QID_ANY);
if (IS_ERR(rq)) {
status = NVME_SC_INTERNAL;
goto out_put_ns;

View File

@@ -777,6 +777,15 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
/* See SSC3rXX or current. */
action = ACTION_FAIL;
break;
case DATA_PROTECT:
action = ACTION_FAIL;
if ((sshdr.asc == 0x0C && sshdr.ascq == 0x12) ||
(sshdr.asc == 0x55 &&
(sshdr.ascq == 0x0E || sshdr.ascq == 0x0F))) {
/* Insufficient zone resources */
blk_stat = BLK_STS_ZONE_OPEN_RESOURCE;
}
break;
default:
action = ACTION_FAIL;
break;

View File

@@ -47,10 +47,11 @@ static unsigned evtchn_2l_max_channels(void)
return EVTCHN_2L_NR_CHANNELS;
}
static void evtchn_2l_bind_to_cpu(struct irq_info *info, unsigned cpu)
static void evtchn_2l_bind_to_cpu(evtchn_port_t evtchn, unsigned int cpu,
unsigned int old_cpu)
{
clear_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, info->cpu)));
set_bit(info->evtchn, BM(per_cpu(cpu_evtchn_mask, cpu)));
clear_bit(evtchn, BM(per_cpu(cpu_evtchn_mask, old_cpu)));
set_bit(evtchn, BM(per_cpu(cpu_evtchn_mask, cpu)));
}
static void evtchn_2l_clear_pending(evtchn_port_t port)

View File

@@ -70,6 +70,57 @@
#undef MODULE_PARAM_PREFIX
#define MODULE_PARAM_PREFIX "xen."
/* Interrupt types. */
enum xen_irq_type {
IRQT_UNBOUND = 0,
IRQT_PIRQ,
IRQT_VIRQ,
IRQT_IPI,
IRQT_EVTCHN
};
/*
* Packed IRQ information:
* type - enum xen_irq_type
* event channel - irq->event channel mapping
* cpu - cpu this event channel is bound to
* index - type-specific information:
* PIRQ - vector, with MSB being "needs EOI", or physical IRQ of the HVM
* guest, or GSI (real passthrough IRQ) of the device.
* VIRQ - virq number
* IPI - IPI vector
* EVTCHN -
*/
struct irq_info {
struct list_head list;
struct list_head eoi_list;
short refcnt;
short spurious_cnt;
enum xen_irq_type type; /* type */
unsigned irq;
evtchn_port_t evtchn; /* event channel */
unsigned short cpu; /* cpu bound */
unsigned short eoi_cpu; /* EOI must happen on this cpu-1 */
unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */
u64 eoi_time; /* Time in jiffies when to EOI. */
union {
unsigned short virq;
enum ipi_vector ipi;
struct {
unsigned short pirq;
unsigned short gsi;
unsigned char vector;
unsigned char flags;
uint16_t domid;
} pirq;
} u;
};
#define PIRQ_NEEDS_EOI (1 << 0)
#define PIRQ_SHAREABLE (1 << 1)
#define PIRQ_MSI_GROUP (1 << 2)
static uint __read_mostly event_loop_timeout = 2;
module_param(event_loop_timeout, uint, 0644);
@@ -110,7 +161,7 @@ static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1};
/* IRQ <-> IPI mapping */
static DEFINE_PER_CPU(int [XEN_NR_IPIS], ipi_to_irq) = {[0 ... XEN_NR_IPIS-1] = -1};
int **evtchn_to_irq;
static int **evtchn_to_irq;
#ifdef CONFIG_X86
static unsigned long *pirq_eoi_map;
#endif
@@ -190,7 +241,7 @@ int get_evtchn_to_irq(evtchn_port_t evtchn)
}
/* Get info for IRQ */
struct irq_info *info_for_irq(unsigned irq)
static struct irq_info *info_for_irq(unsigned irq)
{
if (irq < nr_legacy_irqs())
return legacy_info_ptrs[irq];
@@ -228,7 +279,7 @@ static int xen_irq_info_common_setup(struct irq_info *info,
irq_clear_status_flags(irq, IRQ_NOREQUEST|IRQ_NOAUTOEN);
return xen_evtchn_port_setup(info);
return xen_evtchn_port_setup(evtchn);
}
static int xen_irq_info_evtchn_setup(unsigned irq,
@@ -351,7 +402,7 @@ static enum xen_irq_type type_from_irq(unsigned irq)
return info_for_irq(irq)->type;
}
unsigned cpu_from_irq(unsigned irq)
static unsigned cpu_from_irq(unsigned irq)
{
return info_for_irq(irq)->cpu;
}
@@ -391,7 +442,7 @@ static void bind_evtchn_to_cpu(evtchn_port_t evtchn, unsigned int cpu)
#ifdef CONFIG_SMP
cpumask_copy(irq_get_affinity_mask(irq), cpumask_of(cpu));
#endif
xen_evtchn_port_bind_to_cpu(info, cpu);
xen_evtchn_port_bind_to_cpu(evtchn, cpu, info->cpu);
info->cpu = cpu;
}
@@ -745,7 +796,7 @@ static unsigned int __startup_pirq(unsigned int irq)
info->evtchn = evtchn;
bind_evtchn_to_cpu(evtchn, 0);
rc = xen_evtchn_port_setup(info);
rc = xen_evtchn_port_setup(evtchn);
if (rc)
goto err;
@@ -1145,14 +1196,6 @@ static int bind_interdomain_evtchn_to_irq_chip(unsigned int remote_domain,
chip);
}
int bind_interdomain_evtchn_to_irq(unsigned int remote_domain,
evtchn_port_t remote_port)
{
return bind_interdomain_evtchn_to_irq_chip(remote_domain, remote_port,
&xen_dynamic_chip);
}
EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irq);
int bind_interdomain_evtchn_to_irq_lateeoi(unsigned int remote_domain,
evtchn_port_t remote_port)
{
@@ -1320,19 +1363,6 @@ static int bind_interdomain_evtchn_to_irqhandler_chip(
return irq;
}
int bind_interdomain_evtchn_to_irqhandler(unsigned int remote_domain,
evtchn_port_t remote_port,
irq_handler_t handler,
unsigned long irqflags,
const char *devname,
void *dev_id)
{
return bind_interdomain_evtchn_to_irqhandler_chip(remote_domain,
remote_port, handler, irqflags, devname,
dev_id, &xen_dynamic_chip);
}
EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irqhandler);
int bind_interdomain_evtchn_to_irqhandler_lateeoi(unsigned int remote_domain,
evtchn_port_t remote_port,
irq_handler_t handler,
@@ -2020,8 +2050,8 @@ void xen_setup_callback_vector(void) {}
static inline void xen_alloc_callback_vector(void) {}
#endif
static bool fifo_events = true;
module_param(fifo_events, bool, 0);
bool xen_fifo_events = true;
module_param_named(fifo_events, xen_fifo_events, bool, 0);
static int xen_evtchn_cpu_prepare(unsigned int cpu)
{
@@ -2050,10 +2080,12 @@ void __init xen_init_IRQ(void)
int ret = -EINVAL;
evtchn_port_t evtchn;
if (fifo_events)
if (xen_fifo_events)
ret = xen_evtchn_fifo_init();
if (ret < 0)
if (ret < 0) {
xen_evtchn_2l_init();
xen_fifo_events = false;
}
xen_cpu_init_eoi(smp_processor_id());

View File

@@ -138,9 +138,8 @@ static void init_array_page(event_word_t *array_page)
array_page[i] = 1 << EVTCHN_FIFO_MASKED;
}
static int evtchn_fifo_setup(struct irq_info *info)
static int evtchn_fifo_setup(evtchn_port_t port)
{
evtchn_port_t port = info->evtchn;
unsigned new_array_pages;
int ret;
@@ -186,7 +185,8 @@ static int evtchn_fifo_setup(struct irq_info *info)
return ret;
}
static void evtchn_fifo_bind_to_cpu(struct irq_info *info, unsigned cpu)
static void evtchn_fifo_bind_to_cpu(evtchn_port_t evtchn, unsigned int cpu,
unsigned int old_cpu)
{
/* no-op */
}
@@ -237,6 +237,9 @@ static bool clear_masked_cond(volatile event_word_t *word)
w = *word;
do {
if (!(w & (1 << EVTCHN_FIFO_MASKED)))
return true;
if (w & (1 << EVTCHN_FIFO_PENDING))
return false;

View File

@@ -7,65 +7,15 @@
#ifndef __EVENTS_INTERNAL_H__
#define __EVENTS_INTERNAL_H__
/* Interrupt types. */
enum xen_irq_type {
IRQT_UNBOUND = 0,
IRQT_PIRQ,
IRQT_VIRQ,
IRQT_IPI,
IRQT_EVTCHN
};
/*
* Packed IRQ information:
* type - enum xen_irq_type
* event channel - irq->event channel mapping
* cpu - cpu this event channel is bound to
* index - type-specific information:
* PIRQ - vector, with MSB being "needs EOI", or physical IRQ of the HVM
* guest, or GSI (real passthrough IRQ) of the device.
* VIRQ - virq number
* IPI - IPI vector
* EVTCHN -
*/
struct irq_info {
struct list_head list;
struct list_head eoi_list;
short refcnt;
short spurious_cnt;
enum xen_irq_type type; /* type */
unsigned irq;
evtchn_port_t evtchn; /* event channel */
unsigned short cpu; /* cpu bound */
unsigned short eoi_cpu; /* EOI must happen on this cpu */
unsigned int irq_epoch; /* If eoi_cpu valid: irq_epoch of event */
u64 eoi_time; /* Time in jiffies when to EOI. */
union {
unsigned short virq;
enum ipi_vector ipi;
struct {
unsigned short pirq;
unsigned short gsi;
unsigned char vector;
unsigned char flags;
uint16_t domid;
} pirq;
} u;
};
#define PIRQ_NEEDS_EOI (1 << 0)
#define PIRQ_SHAREABLE (1 << 1)
#define PIRQ_MSI_GROUP (1 << 2)
struct evtchn_loop_ctrl;
struct evtchn_ops {
unsigned (*max_channels)(void);
unsigned (*nr_channels)(void);
int (*setup)(struct irq_info *info);
void (*bind_to_cpu)(struct irq_info *info, unsigned cpu);
int (*setup)(evtchn_port_t port);
void (*bind_to_cpu)(evtchn_port_t evtchn, unsigned int cpu,
unsigned int old_cpu);
void (*clear_pending)(evtchn_port_t port);
void (*set_pending)(evtchn_port_t port);
@@ -83,12 +33,9 @@ struct evtchn_ops {
extern const struct evtchn_ops *evtchn_ops;
extern int **evtchn_to_irq;
int get_evtchn_to_irq(evtchn_port_t evtchn);
void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl);
struct irq_info *info_for_irq(unsigned irq);
unsigned cpu_from_irq(unsigned irq);
unsigned int cpu_from_evtchn(evtchn_port_t evtchn);
static inline unsigned xen_evtchn_max_channels(void)
@@ -100,17 +47,18 @@ static inline unsigned xen_evtchn_max_channels(void)
* Do any ABI specific setup for a bound event channel before it can
* be unmasked and used.
*/
static inline int xen_evtchn_port_setup(struct irq_info *info)
static inline int xen_evtchn_port_setup(evtchn_port_t evtchn)
{
if (evtchn_ops->setup)
return evtchn_ops->setup(info);
return evtchn_ops->setup(evtchn);
return 0;
}
static inline void xen_evtchn_port_bind_to_cpu(struct irq_info *info,
unsigned cpu)
static inline void xen_evtchn_port_bind_to_cpu(evtchn_port_t evtchn,
unsigned int cpu,
unsigned int old_cpu)
{
evtchn_ops->bind_to_cpu(info, cpu);
evtchn_ops->bind_to_cpu(evtchn, cpu, old_cpu);
}
static inline void clear_evtchn(evtchn_port_t port)

View File

@@ -260,8 +260,7 @@ static int v9fs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_bavail = rs.bavail;
buf->f_files = rs.files;
buf->f_ffree = rs.ffree;
buf->f_fsid.val[0] = rs.fsid & 0xFFFFFFFFUL;
buf->f_fsid.val[1] = (rs.fsid >> 32) & 0xFFFFFFFFUL;
buf->f_fsid = u64_to_fsid(rs.fsid);
buf->f_namelen = rs.namelen;
}
if (res != -ENOSYS)
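This and all of the statfs conversions that follow switch the open-coded two-word split to the new u64_to_fsid() helper. As a sketch, its likely definition (quoted from memory of include/linux/statfs.h in this kernel, not from this diff) packs the 64-bit id exactly as the removed code did:

/* sketch: the helper these hunks switch to; it packs a 64-bit id into the
 * two 32-bit words of __kernel_fsid_t exactly as the open-coded versions did */
static inline __kernel_fsid_t u64_to_fsid(u64 v)
{
	return (__kernel_fsid_t){.val = {(u32)v, (u32)(v >> 32)}};
}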

View File

@@ -210,8 +210,7 @@ static int adfs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_namelen = sbi->s_namelen;
buf->f_bsize = sb->s_blocksize;
buf->f_ffree = (long)(buf->f_bfree * buf->f_files) / (long)buf->f_blocks;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
return 0;
}

View File

@@ -620,8 +620,7 @@ affs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_blocks = AFFS_SB(sb)->s_partition_size - AFFS_SB(sb)->s_reserved;
buf->f_bfree = free;
buf->f_bavail = free;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = AFFSNAMEMAX;
return 0;
}

View File

@@ -963,8 +963,7 @@ befs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_bavail = buf->f_bfree;
buf->f_files = 0; /* UNKNOWN */
buf->f_ffree = 0; /* UNKNOWN */
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = BEFS_NAME_LEN;
befs_debug(sb, "<--- %s", __func__);

View File

@@ -229,8 +229,7 @@ static int bfs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_bfree = buf->f_bavail = info->si_freeb;
buf->f_files = info->si_lasti + 1 - BFS_ROOT_INO;
buf->f_ffree = info->si_freei;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = BFS_NAMELEN;
return 0;
}

View File

@@ -104,8 +104,7 @@ static int ceph_statfs(struct dentry *dentry, struct kstatfs *buf)
le64_to_cpu(*((__le64 *)&monc->monmap->fsid + 1));
mutex_unlock(&monc->mutex);
buf->f_fsid.val[0] = fsid & 0xffffffff;
buf->f_fsid.val[1] = fsid >> 32;
buf->f_fsid = u64_to_fsid(fsid);
return 0;
}

View File

@@ -156,5 +156,5 @@ extern int cifs_truncate_page(struct address_space *mapping, loff_t from);
extern const struct export_operations cifs_export_ops;
#endif /* CONFIG_CIFS_NFSD_EXPORT */
#define CIFS_VERSION "2.28"
#define CIFS_VERSION "2.29"
#endif /* _CIFSFS_H */

View File

@@ -298,6 +298,10 @@ struct smb_version_operations {
/* query file data from the server */
int (*query_file_info)(const unsigned int, struct cifs_tcon *,
struct cifs_fid *, FILE_ALL_INFO *);
/* query reparse tag from srv to determine which type of special file */
int (*query_reparse_tag)(const unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb, const char *path,
__u32 *reparse_tag);
/* get server index number */
int (*get_srv_inum)(const unsigned int, struct cifs_tcon *,
struct cifs_sb_info *, const char *,

View File

@@ -656,7 +656,7 @@ smb311_posix_info_to_fattr(struct cifs_fattr *fattr, struct smb311_posix_qinfo *
static void
cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info,
struct super_block *sb, bool adjust_tz,
bool symlink)
bool symlink, u32 reparse_tag)
{
struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
@@ -684,8 +684,22 @@ cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info,
fattr->cf_createtime = le64_to_cpu(info->CreationTime);
fattr->cf_nlink = le32_to_cpu(info->NumberOfLinks);
if (symlink) {
if (reparse_tag == IO_REPARSE_TAG_LX_SYMLINK) {
fattr->cf_mode |= S_IFLNK | cifs_sb->mnt_file_mode;
fattr->cf_dtype = DT_LNK;
} else if (reparse_tag == IO_REPARSE_TAG_LX_FIFO) {
fattr->cf_mode |= S_IFIFO | cifs_sb->mnt_file_mode;
fattr->cf_dtype = DT_FIFO;
} else if (reparse_tag == IO_REPARSE_TAG_AF_UNIX) {
fattr->cf_mode |= S_IFSOCK | cifs_sb->mnt_file_mode;
fattr->cf_dtype = DT_SOCK;
} else if (reparse_tag == IO_REPARSE_TAG_LX_CHR) {
fattr->cf_mode |= S_IFCHR | cifs_sb->mnt_file_mode;
fattr->cf_dtype = DT_CHR;
} else if (reparse_tag == IO_REPARSE_TAG_LX_BLK) {
fattr->cf_mode |= S_IFBLK | cifs_sb->mnt_file_mode;
fattr->cf_dtype = DT_BLK;
} else if (symlink) { /* TODO add more reparse tag checks */
fattr->cf_mode = S_IFLNK;
fattr->cf_dtype = DT_LNK;
} else if (fattr->cf_cifsattrs & ATTR_DIRECTORY) {
@@ -740,8 +754,9 @@ cifs_get_file_info(struct file *filp)
rc = server->ops->query_file_info(xid, tcon, &cfile->fid, &find_data);
switch (rc) {
case 0:
/* TODO: add support to query reparse tag */
cifs_all_info_to_fattr(&fattr, &find_data, inode->i_sb, false,
false);
false, 0 /* no reparse tag */);
break;
case -EREMOTE:
cifs_create_dfs_fattr(&fattr, inode->i_sb);
@@ -910,12 +925,13 @@ cifs_get_inode_info(struct inode **inode,
struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
bool adjust_tz = false;
struct cifs_fattr fattr = {0};
bool symlink = false;
bool is_reparse_point = false;
FILE_ALL_INFO *data = in_data;
FILE_ALL_INFO *tmp_data = NULL;
void *smb1_backup_rsp_buf = NULL;
int rc = 0;
int tmprc = 0;
__u32 reparse_tag = 0;
tlink = cifs_sb_tlink(cifs_sb);
if (IS_ERR(tlink))
@@ -939,7 +955,7 @@ cifs_get_inode_info(struct inode **inode,
}
rc = server->ops->query_path_info(xid, tcon, cifs_sb,
full_path, tmp_data,
&adjust_tz, &symlink);
&adjust_tz, &is_reparse_point);
data = tmp_data;
}
@@ -949,7 +965,19 @@ cifs_get_inode_info(struct inode **inode,
switch (rc) {
case 0:
cifs_all_info_to_fattr(&fattr, data, sb, adjust_tz, symlink);
/*
* If the file is a reparse point, it is more complicated
* since we have to check if its reparse tag matches a known
* special file type e.g. symlink or fifo or char etc.
*/
if ((le32_to_cpu(data->Attributes) & ATTR_REPARSE) &&
server->ops->query_reparse_tag) {
rc = server->ops->query_reparse_tag(xid, tcon, cifs_sb,
full_path, &reparse_tag);
cifs_dbg(FYI, "reparse tag 0x%x\n", reparse_tag);
}
cifs_all_info_to_fattr(&fattr, data, sb, adjust_tz,
is_reparse_point, reparse_tag);
break;
case -EREMOTE:
/* DFS link, no metadata available on this server */

View File

@@ -506,7 +506,7 @@ move_smb2_info_to_cifs(FILE_ALL_INFO *dst, struct smb2_file_all_info *src)
int
smb2_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb, const char *full_path,
FILE_ALL_INFO *data, bool *adjust_tz, bool *symlink)
FILE_ALL_INFO *data, bool *adjust_tz, bool *reparse)
{
int rc;
struct smb2_file_all_info *smb2_data;
@@ -516,7 +516,7 @@ smb2_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
struct cached_fid *cfid = NULL;
*adjust_tz = false;
*symlink = false;
*reparse = false;
smb2_data = kzalloc(sizeof(struct smb2_file_all_info) + PATH_MAX * 2,
GFP_KERNEL);
@@ -548,7 +548,7 @@ smb2_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
FILE_READ_ATTRIBUTES, FILE_OPEN, create_options,
ACL_NO_MODE, smb2_data, SMB2_OP_QUERY_INFO, cfile);
if (rc == -EOPNOTSUPP) {
*symlink = true;
*reparse = true;
create_options |= OPEN_REPARSE_POINT;
/* Failed on a symbolic link - query a reparse point info */
@@ -570,7 +570,7 @@ out:
int
smb311_posix_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb, const char *full_path,
struct smb311_posix_qinfo *data, bool *adjust_tz, bool *symlink)
struct smb311_posix_qinfo *data, bool *adjust_tz, bool *reparse)
{
int rc;
__u32 create_options = 0;
@@ -578,7 +578,7 @@ smb311_posix_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
struct smb311_posix_qinfo *smb2_data;
*adjust_tz = false;
*symlink = false;
*reparse = false;
/* BB TODO: Make struct larger when add support for parsing owner SIDs */
smb2_data = kzalloc(sizeof(struct smb311_posix_qinfo),
@@ -599,7 +599,7 @@ smb311_posix_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
ACL_NO_MODE, smb2_data, SMB2_OP_POSIX_QUERY_INFO, cfile);
if (rc == -EOPNOTSUPP) {
/* BB TODO: When support for special files added to Samba re-verify this path */
*symlink = true;
*reparse = true;
create_options |= OPEN_REPARSE_POINT;
/* Failed on a symbolic link - query a reparse point info */

View File

@@ -3034,6 +3034,133 @@ smb2_query_symlink(const unsigned int xid, struct cifs_tcon *tcon,
return rc;
}
int
smb2_query_reparse_tag(const unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb, const char *full_path,
__u32 *tag)
{
int rc;
__le16 *utf16_path = NULL;
__u8 oplock = SMB2_OPLOCK_LEVEL_NONE;
struct cifs_open_parms oparms;
struct cifs_fid fid;
struct TCP_Server_Info *server = cifs_pick_channel(tcon->ses);
int flags = 0;
struct smb_rqst rqst[3];
int resp_buftype[3];
struct kvec rsp_iov[3];
struct kvec open_iov[SMB2_CREATE_IOV_SIZE];
struct kvec io_iov[SMB2_IOCTL_IOV_SIZE];
struct kvec close_iov[1];
struct smb2_ioctl_rsp *ioctl_rsp;
struct reparse_data_buffer *reparse_buf;
u32 plen;
cifs_dbg(FYI, "%s: path: %s\n", __func__, full_path);
if (smb3_encryption_required(tcon))
flags |= CIFS_TRANSFORM_REQ;
memset(rqst, 0, sizeof(rqst));
resp_buftype[0] = resp_buftype[1] = resp_buftype[2] = CIFS_NO_BUFFER;
memset(rsp_iov, 0, sizeof(rsp_iov));
utf16_path = cifs_convert_path_to_utf16(full_path, cifs_sb);
if (!utf16_path)
return -ENOMEM;
/*
* setup smb2open - TODO add optimization to call cifs_get_readable_path
* to see if there is a handle already open that we can use
*/
memset(&open_iov, 0, sizeof(open_iov));
rqst[0].rq_iov = open_iov;
rqst[0].rq_nvec = SMB2_CREATE_IOV_SIZE;
memset(&oparms, 0, sizeof(oparms));
oparms.tcon = tcon;
oparms.desired_access = FILE_READ_ATTRIBUTES;
oparms.disposition = FILE_OPEN;
oparms.create_options = cifs_create_options(cifs_sb, OPEN_REPARSE_POINT);
oparms.fid = &fid;
oparms.reconnect = false;
rc = SMB2_open_init(tcon, server,
&rqst[0], &oplock, &oparms, utf16_path);
if (rc)
goto query_rp_exit;
smb2_set_next_command(tcon, &rqst[0]);
/* IOCTL */
memset(&io_iov, 0, sizeof(io_iov));
rqst[1].rq_iov = io_iov;
rqst[1].rq_nvec = SMB2_IOCTL_IOV_SIZE;
rc = SMB2_ioctl_init(tcon, server,
&rqst[1], fid.persistent_fid,
fid.volatile_fid, FSCTL_GET_REPARSE_POINT,
true /* is_fctl */, NULL, 0,
CIFSMaxBufSize -
MAX_SMB2_CREATE_RESPONSE_SIZE -
MAX_SMB2_CLOSE_RESPONSE_SIZE);
if (rc)
goto query_rp_exit;
smb2_set_next_command(tcon, &rqst[1]);
smb2_set_related(&rqst[1]);
/* Close */
memset(&close_iov, 0, sizeof(close_iov));
rqst[2].rq_iov = close_iov;
rqst[2].rq_nvec = 1;
rc = SMB2_close_init(tcon, server,
&rqst[2], COMPOUND_FID, COMPOUND_FID, false);
if (rc)
goto query_rp_exit;
smb2_set_related(&rqst[2]);
rc = compound_send_recv(xid, tcon->ses, server,
flags, 3, rqst,
resp_buftype, rsp_iov);
ioctl_rsp = rsp_iov[1].iov_base;
/*
* Open was successful and we got an ioctl response.
*/
if (rc == 0) {
/* See MS-FSCC 2.3.23 */
reparse_buf = (struct reparse_data_buffer *)
((char *)ioctl_rsp +
le32_to_cpu(ioctl_rsp->OutputOffset));
plen = le32_to_cpu(ioctl_rsp->OutputCount);
if (plen + le32_to_cpu(ioctl_rsp->OutputOffset) >
rsp_iov[1].iov_len) {
cifs_tcon_dbg(FYI, "srv returned invalid ioctl len: %d\n",
plen);
rc = -EIO;
goto query_rp_exit;
}
*tag = le32_to_cpu(reparse_buf->ReparseTag);
}
query_rp_exit:
kfree(utf16_path);
SMB2_open_free(&rqst[0]);
SMB2_ioctl_free(&rqst[1]);
SMB2_close_free(&rqst[2]);
free_rsp_buf(resp_buftype[0], rsp_iov[0].iov_base);
free_rsp_buf(resp_buftype[1], rsp_iov[1].iov_base);
free_rsp_buf(resp_buftype[2], rsp_iov[2].iov_base);
return rc;
}
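For reference, the response parsing above indexes a reparse_data_buffer at OutputOffset before reading ReparseTag; a sketch of that on-the-wire layout per MS-FSCC 2.1.2, with field names assumed to match the cifs definition:

/* sketch: generic reparse point data buffer (MS-FSCC 2.1.2) */
struct reparse_data_buffer {
	__le32	ReparseTag;		/* e.g. IO_REPARSE_TAG_LX_SYMLINK */
	__le16	ReparseDataLength;
	__u16	Reserved;
	__u8	DataBuffer[];		/* tag-specific payload */
} __packed;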
static struct cifs_ntsd *
get_smb2_acl_by_fid(struct cifs_sb_info *cifs_sb,
const struct cifs_fid *cifsfid, u32 *pacllen)
@@ -4986,6 +5113,8 @@ struct smb_version_operations smb30_operations = {
.can_echo = smb2_can_echo,
.echo = SMB2_echo,
.query_path_info = smb2_query_path_info,
/* WSL tags introduced long after smb2.1, enable for SMB3, 3.11 only */
.query_reparse_tag = smb2_query_reparse_tag,
.get_srv_inum = smb2_get_srv_inum,
.query_file_info = smb2_query_file_info,
.set_path_size = smb2_set_path_size,
@@ -5097,6 +5226,7 @@ struct smb_version_operations smb311_operations = {
.can_echo = smb2_can_echo,
.echo = SMB2_echo,
.query_path_info = smb2_query_path_info,
.query_reparse_tag = smb2_query_reparse_tag,
.get_srv_inum = smb2_get_srv_inum,
.query_file_info = smb2_query_file_info,
.set_path_size = smb2_set_path_size,

View File

@@ -999,6 +999,31 @@ struct copychunk_ioctl_rsp {
__le32 TotalBytesWritten;
} __packed;
/* See MS-FSCC 2.3.29 and 2.3.30 */
struct get_retrieval_pointer_count_req {
__le64 StartingVcn; /* virtual cluster number (signed) */
} __packed;
struct get_retrieval_pointer_count_rsp {
__le32 ExtentCount;
} __packed;
/*
* See MS-FSCC 2.3.33 and 2.3.34
* request is the same as get_retrieval_pointer_count_req struct above
*/
struct smb3_extents {
__le64 NextVcn;
__le64 Lcn; /* logical cluster number */
} __packed;
struct get_retrieval_pointers_refcount_rsp {
__le32 ExtentCount;
__u32 Reserved;
__le64 StartingVcn;
struct smb3_extents extents[];
} __packed;
struct fsctl_set_integrity_information_req {
__le16 ChecksumAlgorithm;
__le16 Reserved;
@@ -1640,6 +1665,7 @@ struct smb2_file_rename_info { /* encoding of request for level 10 */
__u64 RootDirectory; /* MBZ for network operations (why says spec?) */
__le32 FileNameLength;
char FileName[]; /* New name to be assigned */
/* padding - overall struct size must be >= 24 so filename + pad >= 6 */
} __packed; /* level 10 Set */
struct smb2_file_link_info { /* encoding of request for level 11 */
@@ -1691,6 +1717,11 @@ struct smb2_file_eof_info { /* encoding of request for level 10 */
__le64 EndOfFile; /* new end of file value */
} __packed; /* level 20 Set */
struct smb2_file_reparse_point_info {
__le64 IndexNumber;
__le32 Tag;
} __packed;
struct smb2_file_network_open_info {
__le64 CreationTime;
__le64 LastAccessTime;

View File

@@ -77,6 +77,9 @@ extern void close_shroot_lease(struct cached_fid *cfid);
extern void close_shroot_lease_locked(struct cached_fid *cfid);
extern void move_smb2_info_to_cifs(FILE_ALL_INFO *dst,
struct smb2_file_all_info *src);
extern int smb2_query_reparse_tag(const unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb, const char *path,
__u32 *reparse_tag);
extern int smb2_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
struct cifs_sb_info *cifs_sb,
const char *full_path, FILE_ALL_INFO *data,

View File

@@ -103,6 +103,8 @@
#define FSCTL_SET_ZERO_ON_DEALLOC 0x00090194 /* BB add struct */
#define FSCTL_SET_SHORT_NAME_BEHAVIOR 0x000901B4 /* BB add struct */
#define FSCTL_GET_INTEGRITY_INFORMATION 0x0009027C
#define FSCTL_GET_RETRIEVAL_POINTERS_AND_REFCOUNT 0x000903d3
#define FSCTL_GET_RETRIEVAL_POINTER_COUNT 0x0009042b
#define FSCTL_QUERY_ALLOCATED_RANGES 0x000940CF
#define FSCTL_SET_DEFECT_MANAGEMENT 0x00098134 /* BB add struct */
#define FSCTL_FILE_LEVEL_TRIM 0x00098208 /* BB add struct */

View File

@@ -690,8 +690,7 @@ static int cramfs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_bavail = 0;
buf->f_files = CRAMFS_SB(sb)->files;
buf->f_ffree = 0;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = CRAMFS_MAXPATHLEN;
return 0;
}

View File

@@ -342,8 +342,7 @@ static int efs_statfs(struct dentry *dentry, struct kstatfs *buf) {
sbi->inode_blocks *
(EFS_BLOCKSIZE / sizeof(struct efs_dinode));
buf->f_ffree = sbi->inode_free; /* free inodes */
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = EFS_MAXNAMELEN; /* max filename length */
return 0;

View File

@@ -561,8 +561,7 @@ static int erofs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_namelen = EROFS_NAME_LEN;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
return 0;
}

View File

@@ -89,8 +89,7 @@ static int exfat_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_blocks = sbi->num_clusters - 2; /* clu 0 & 1 */
buf->f_bfree = buf->f_blocks - sbi->used_clusters;
buf->f_bavail = buf->f_bfree;
buf->f_fsid.val[0] = (unsigned int)id;
buf->f_fsid.val[1] = (unsigned int)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
/* Unicode utf16 255 characters */
buf->f_namelen = EXFAT_MAX_FILE_LEN * NLS_MAX_CHARSET_SIZE;
return 0;

View File

@@ -1455,8 +1455,7 @@ static int ext2_statfs (struct dentry * dentry, struct kstatfs * buf)
buf->f_namelen = EXT2_NAME_LEN;
fsid = le64_to_cpup((void *)es->s_uuid) ^
le64_to_cpup((void *)es->s_uuid + sizeof(u64));
buf->f_fsid.val[0] = fsid & 0xFFFFFFFFUL;
buf->f_fsid.val[1] = (fsid >> 32) & 0xFFFFFFFFUL;
buf->f_fsid = u64_to_fsid(fsid);
spin_unlock(&sbi->s_lock);
return 0;
}

View File

@@ -6133,8 +6133,7 @@ static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_namelen = EXT4_NAME_LEN;
fsid = le64_to_cpup((void *)es->s_uuid) ^
le64_to_cpup((void *)es->s_uuid + sizeof(u64));
buf->f_fsid.val[0] = fsid & 0xFFFFFFFFUL;
buf->f_fsid.val[1] = (fsid >> 32) & 0xFFFFFFFFUL;
buf->f_fsid = u64_to_fsid(fsid);
#ifdef CONFIG_QUOTA
if (ext4_test_inode_flag(dentry->d_inode, EXT4_INODE_PROJINHERIT) &&

View File

@@ -1442,8 +1442,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
}
buf->f_namelen = F2FS_NAME_LEN;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
#ifdef CONFIG_QUOTA
if (is_inode_flag_set(dentry->d_inode, FI_PROJ_INHERIT) &&

View File

@@ -836,8 +836,7 @@ static int fat_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_blocks = sbi->max_cluster - FAT_START_ENT;
buf->f_bfree = sbi->free_clusters;
buf->f_bavail = sbi->free_clusters;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen =
(sbi->options.isvfat ? FAT_LFN_LEN : 12) * NLS_MAX_CHARSET_SIZE;

View File

@@ -104,8 +104,7 @@ static int hfs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_bavail = buf->f_bfree;
buf->f_files = HFS_SB(sb)->fs_ablocks;
buf->f_ffree = HFS_SB(sb)->free_ablocks;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = HFS_NAMELEN;
return 0;

View File

@@ -320,8 +320,7 @@ static int hfsplus_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_bavail = buf->f_bfree;
buf->f_files = 0xFFFFFFFF;
buf->f_ffree = 0xFFFFFFFF - sbi->next_cnid;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = HFSPLUS_MAX_STRLEN;
return 0;

View File

@@ -192,8 +192,7 @@ static int hpfs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_bavail = sbi->sb_n_free;
buf->f_files = sbi->sb_dirband_size / 4;
buf->f_ffree = hpfs_get_free_dnodes(s);
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = 254;
hpfs_unlock(s);

View File

@@ -19,7 +19,9 @@
#include <linux/task_work.h>
#include <linux/blk-cgroup.h>
#include <linux/audit.h>
#include <linux/cpu.h>
#include "../kernel/sched/sched.h"
#include "io-wq.h"
#define WORKER_IDLE_TIMEOUT (5 * HZ)
@@ -123,9 +125,13 @@ struct io_wq {
refcount_t refs;
struct completion done;
struct hlist_node cpuhp_node;
refcount_t use_refs;
};
static enum cpuhp_state io_wq_online;
static bool io_worker_get(struct io_worker *worker)
{
return refcount_inc_not_zero(&worker->ref);
@@ -187,7 +193,8 @@ static bool __io_worker_unuse(struct io_wqe *wqe, struct io_worker *worker)
worker->blkcg_css = NULL;
}
#endif
if (current->signal->rlim[RLIMIT_FSIZE].rlim_cur != RLIM_INFINITY)
current->signal->rlim[RLIMIT_FSIZE].rlim_cur = RLIM_INFINITY;
return dropped_lock;
}
@@ -483,7 +490,10 @@ static void io_impersonate_work(struct io_worker *worker,
if ((work->flags & IO_WQ_WORK_CREDS) &&
worker->cur_creds != work->identity->creds)
io_wq_switch_creds(worker, work);
if (work->flags & IO_WQ_WORK_FSIZE)
current->signal->rlim[RLIMIT_FSIZE].rlim_cur = work->identity->fsize;
else if (current->signal->rlim[RLIMIT_FSIZE].rlim_cur != RLIM_INFINITY)
current->signal->rlim[RLIMIT_FSIZE].rlim_cur = RLIM_INFINITY;
io_wq_switch_blkcg(worker, work);
#ifdef CONFIG_AUDIT
current->loginuid = work->identity->loginuid;
@@ -1087,10 +1097,12 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
return ERR_PTR(-ENOMEM);
wq->wqes = kcalloc(nr_node_ids, sizeof(struct io_wqe *), GFP_KERNEL);
if (!wq->wqes) {
kfree(wq);
return ERR_PTR(-ENOMEM);
}
if (!wq->wqes)
goto err_wq;
ret = cpuhp_state_add_instance_nocalls(io_wq_online, &wq->cpuhp_node);
if (ret)
goto err_wqes;
wq->free_work = data->free_work;
wq->do_work = data->do_work;
@@ -1098,6 +1110,7 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
/* caller must already hold a reference to this */
wq->user = data->user;
ret = -ENOMEM;
for_each_node(node) {
struct io_wqe *wqe;
int alloc_node = node;
@@ -1141,9 +1154,12 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
ret = PTR_ERR(wq->manager);
complete(&wq->done);
err:
cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
for_each_node(node)
kfree(wq->wqes[node]);
err_wqes:
kfree(wq->wqes);
err_wq:
kfree(wq);
return ERR_PTR(ret);
}
@@ -1160,6 +1176,8 @@ static void __io_wq_destroy(struct io_wq *wq)
{
int node;
cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
set_bit(IO_WQ_BIT_EXIT, &wq->state);
if (wq->manager)
kthread_stop(wq->manager);
@@ -1187,3 +1205,41 @@ struct task_struct *io_wq_get_task(struct io_wq *wq)
{
return wq->manager;
}
static bool io_wq_worker_affinity(struct io_worker *worker, void *data)
{
struct task_struct *task = worker->task;
struct rq_flags rf;
struct rq *rq;
rq = task_rq_lock(task, &rf);
do_set_cpus_allowed(task, cpumask_of_node(worker->wqe->node));
task->flags |= PF_NO_SETAFFINITY;
task_rq_unlock(rq, task, &rf);
return false;
}
static int io_wq_cpu_online(unsigned int cpu, struct hlist_node *node)
{
struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
int i;
rcu_read_lock();
for_each_node(i)
io_wq_for_each_worker(wq->wqes[i], io_wq_worker_affinity, NULL);
rcu_read_unlock();
return 0;
}
static __init int io_wq_init(void)
{
int ret;
ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "io-wq/online",
io_wq_cpu_online, NULL);
if (ret < 0)
return ret;
io_wq_online = ret;
return 0;
}
subsys_initcall(io_wq_init);

View File

@@ -17,6 +17,7 @@ enum {
IO_WQ_WORK_MM = 128,
IO_WQ_WORK_CREDS = 256,
IO_WQ_WORK_BLKCG = 512,
IO_WQ_WORK_FSIZE = 1024,
IO_WQ_HASH_SHIFT = 24, /* upper 8 bits are used for hash key */
};

View File

@@ -277,7 +277,7 @@ struct io_ring_ctx {
unsigned sq_mask;
unsigned sq_thread_idle;
unsigned cached_sq_dropped;
atomic_t cached_cq_overflow;
unsigned cached_cq_overflow;
unsigned long sq_check_overflow;
struct list_head defer_list;
@@ -585,6 +585,7 @@ enum {
REQ_F_BUFFER_SELECTED_BIT,
REQ_F_NO_FILE_TABLE_BIT,
REQ_F_WORK_INITIALIZED_BIT,
REQ_F_LTIMEOUT_ACTIVE_BIT,
/* not a real bit, just to check we're not overflowing the space */
__REQ_F_LAST_BIT,
@@ -614,7 +615,7 @@ enum {
REQ_F_CUR_POS = BIT(REQ_F_CUR_POS_BIT),
/* must not punt to workers */
REQ_F_NOWAIT = BIT(REQ_F_NOWAIT_BIT),
/* has linked timeout */
/* has or had linked timeout */
REQ_F_LINK_TIMEOUT = BIT(REQ_F_LINK_TIMEOUT_BIT),
/* regular file */
REQ_F_ISREG = BIT(REQ_F_ISREG_BIT),
@@ -628,6 +629,8 @@ enum {
REQ_F_NO_FILE_TABLE = BIT(REQ_F_NO_FILE_TABLE_BIT),
/* io_wq_work is initialized */
REQ_F_WORK_INITIALIZED = BIT(REQ_F_WORK_INITIALIZED_BIT),
/* linked timeout is active, i.e. prepared by link's head */
REQ_F_LTIMEOUT_ACTIVE = BIT(REQ_F_LTIMEOUT_ACTIVE_BIT),
};
struct async_poll {
@@ -750,8 +753,6 @@ struct io_op_def {
unsigned pollout : 1;
/* op supports buffer selection */
unsigned buffer_select : 1;
/* needs rlimit(RLIMIT_FSIZE) assigned */
unsigned needs_fsize : 1;
/* must always have async data allocated */
unsigned needs_async_data : 1;
/* size of async data needed, if any */
@@ -775,10 +776,10 @@ static const struct io_op_def io_op_defs[] = {
.hash_reg_file = 1,
.unbound_nonreg_file = 1,
.pollout = 1,
.needs_fsize = 1,
.needs_async_data = 1,
.async_size = sizeof(struct io_async_rw),
.work_flags = IO_WQ_WORK_MM | IO_WQ_WORK_BLKCG,
.work_flags = IO_WQ_WORK_MM | IO_WQ_WORK_BLKCG |
IO_WQ_WORK_FSIZE,
},
[IORING_OP_FSYNC] = {
.needs_file = 1,
@@ -789,16 +790,16 @@ static const struct io_op_def io_op_defs[] = {
.unbound_nonreg_file = 1,
.pollin = 1,
.async_size = sizeof(struct io_async_rw),
.work_flags = IO_WQ_WORK_BLKCG,
.work_flags = IO_WQ_WORK_BLKCG | IO_WQ_WORK_MM,
},
[IORING_OP_WRITE_FIXED] = {
.needs_file = 1,
.hash_reg_file = 1,
.unbound_nonreg_file = 1,
.pollout = 1,
.needs_fsize = 1,
.async_size = sizeof(struct io_async_rw),
.work_flags = IO_WQ_WORK_BLKCG,
.work_flags = IO_WQ_WORK_BLKCG | IO_WQ_WORK_FSIZE |
IO_WQ_WORK_MM,
},
[IORING_OP_POLL_ADD] = {
.needs_file = 1,
@@ -856,8 +857,7 @@ static const struct io_op_def io_op_defs[] = {
},
[IORING_OP_FALLOCATE] = {
.needs_file = 1,
.needs_fsize = 1,
.work_flags = IO_WQ_WORK_BLKCG,
.work_flags = IO_WQ_WORK_BLKCG | IO_WQ_WORK_FSIZE,
},
[IORING_OP_OPENAT] = {
.work_flags = IO_WQ_WORK_FILES | IO_WQ_WORK_BLKCG |
@@ -887,9 +887,9 @@ static const struct io_op_def io_op_defs[] = {
.needs_file = 1,
.unbound_nonreg_file = 1,
.pollout = 1,
.needs_fsize = 1,
.async_size = sizeof(struct io_async_rw),
.work_flags = IO_WQ_WORK_MM | IO_WQ_WORK_BLKCG,
.work_flags = IO_WQ_WORK_MM | IO_WQ_WORK_BLKCG |
IO_WQ_WORK_FSIZE,
},
[IORING_OP_FADVISE] = {
.needs_file = 1,
@@ -1070,6 +1070,12 @@ static void io_init_identity(struct io_identity *id)
refcount_set(&id->count, 1);
}
static inline void __io_req_init_async(struct io_kiocb *req)
{
memset(&req->work, 0, sizeof(req->work));
req->flags |= REQ_F_WORK_INITIALIZED;
}
/*
* Note: must call io_req_init_async() for the first time you
* touch any members of io_wq_work.
@@ -1081,8 +1087,7 @@ static inline void io_req_init_async(struct io_kiocb *req)
if (req->flags & REQ_F_WORK_INITIALIZED)
return;
memset(&req->work, 0, sizeof(req->work));
req->flags |= REQ_F_WORK_INITIALIZED;
__io_req_init_async(req);
/* Grab a ref if this isn't our static identity */
req->work.identity = tctx->identity;
@@ -1174,7 +1179,7 @@ static bool req_need_defer(struct io_kiocb *req, u32 seq)
struct io_ring_ctx *ctx = req->ctx;
return seq != ctx->cached_cq_tail
+ atomic_read(&ctx->cached_cq_overflow);
+ READ_ONCE(ctx->cached_cq_overflow);
}
return false;
@@ -1285,8 +1290,11 @@ static bool io_grab_identity(struct io_kiocb *req)
struct io_identity *id = req->work.identity;
struct io_ring_ctx *ctx = req->ctx;
if (def->needs_fsize && id->fsize != rlimit(RLIMIT_FSIZE))
if (def->work_flags & IO_WQ_WORK_FSIZE) {
if (id->fsize != rlimit(RLIMIT_FSIZE))
return false;
req->work.flags |= IO_WQ_WORK_FSIZE;
}
if (!(req->work.flags & IO_WQ_WORK_FILES) &&
(def->work_flags & IO_WQ_WORK_FILES) &&
@@ -1619,8 +1627,9 @@ static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force,
WRITE_ONCE(cqe->res, req->result);
WRITE_ONCE(cqe->flags, req->compl.cflags);
} else {
ctx->cached_cq_overflow++;
WRITE_ONCE(ctx->rings->cq_overflow,
atomic_inc_return(&ctx->cached_cq_overflow));
ctx->cached_cq_overflow);
}
}
@@ -1662,8 +1671,8 @@ static void __io_cqring_fill_event(struct io_kiocb *req, long res, long cflags)
* then we cannot store the request for later flushing, we need
* to drop it on the floor.
*/
WRITE_ONCE(ctx->rings->cq_overflow,
atomic_inc_return(&ctx->cached_cq_overflow));
ctx->cached_cq_overflow++;
WRITE_ONCE(ctx->rings->cq_overflow, ctx->cached_cq_overflow);
} else {
if (list_empty(&ctx->cq_overflow_list)) {
set_bit(0, &ctx->sq_check_overflow);
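The new overflow accounting can be summarized as a simplified sketch (condensed from the hunks above, not verbatim patch code): cached_cq_overflow is only ever modified under completion_lock, so the atomic was unnecessary; the shared ring field and the lockless reader in req_need_defer() pair WRITE_ONCE with READ_ONCE instead:

/* writer: always under ctx->completion_lock */
ctx->cached_cq_overflow++;
WRITE_ONCE(ctx->rings->cq_overflow, ctx->cached_cq_overflow);

/* lockless reader (req_need_defer), paired with the WRITE_ONCE above */
return seq != ctx->cached_cq_tail + READ_ONCE(ctx->cached_cq_overflow);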
@@ -1865,6 +1874,12 @@ static bool __io_kill_linked_timeout(struct io_kiocb *req)
link = list_first_entry(&req->link_list, struct io_kiocb, link_list);
if (link->opcode != IORING_OP_LINK_TIMEOUT)
return false;
/*
* Can happen if a linked timeout fired and link had been like
* req -> link t-out -> link t-out [-> ...]
*/
if (!(link->flags & REQ_F_LTIMEOUT_ACTIVE))
return false;
list_del_init(&link->link_list);
wake_ev = io_link_cancel_timeout(link);
@@ -1908,10 +1923,12 @@ static struct io_kiocb *io_req_link_next(struct io_kiocb *req)
/*
* Called if REQ_F_LINK_HEAD is set, and we fail the head request
*/
static void __io_fail_links(struct io_kiocb *req)
static void io_fail_links(struct io_kiocb *req)
{
struct io_ring_ctx *ctx = req->ctx;
unsigned long flags;
spin_lock_irqsave(&ctx->completion_lock, flags);
while (!list_empty(&req->link_list)) {
struct io_kiocb *link = list_first_entry(&req->link_list,
struct io_kiocb, link_list);
@@ -1933,15 +1950,6 @@ static void __io_fail_links(struct io_kiocb *req)
}
io_commit_cqring(ctx);
}
static void io_fail_links(struct io_kiocb *req)
{
struct io_ring_ctx *ctx = req->ctx;
unsigned long flags;
spin_lock_irqsave(&ctx->completion_lock, flags);
__io_fail_links(req);
spin_unlock_irqrestore(&ctx->completion_lock, flags);
io_cqring_ev_posted(ctx);
@@ -3114,9 +3122,10 @@ static inline loff_t *io_kiocb_ppos(struct kiocb *kiocb)
* For files that don't have ->read_iter() and ->write_iter(), handle them
* by looping over ->read() or ->write() manually.
*/
static ssize_t loop_rw_iter(int rw, struct file *file, struct kiocb *kiocb,
struct iov_iter *iter)
static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
{
struct kiocb *kiocb = &req->rw.kiocb;
struct file *file = req->file;
ssize_t ret = 0;
/*
@@ -3136,11 +3145,8 @@ static ssize_t loop_rw_iter(int rw, struct file *file, struct kiocb *kiocb,
if (!iov_iter_is_bvec(iter)) {
iovec = iov_iter_iovec(iter);
} else {
/* fixed buffers import bvec */
iovec.iov_base = kmap(iter->bvec->bv_page)
+ iter->iov_offset;
iovec.iov_len = min(iter->count,
iter->bvec->bv_len - iter->iov_offset);
iovec.iov_base = u64_to_user_ptr(req->rw.addr);
iovec.iov_len = req->rw.len;
}
if (rw == READ) {
@@ -3151,9 +3157,6 @@ static ssize_t loop_rw_iter(int rw, struct file *file, struct kiocb *kiocb,
iovec.iov_len, io_kiocb_ppos(kiocb));
}
if (iov_iter_is_bvec(iter))
kunmap(iter->bvec->bv_page);
if (nr < 0) {
if (!ret)
ret = nr;
@@ -3162,6 +3165,8 @@ static ssize_t loop_rw_iter(int rw, struct file *file, struct kiocb *kiocb,
ret += nr;
if (nr != iovec.iov_len)
break;
req->rw.len -= nr;
req->rw.addr += nr;
iov_iter_advance(iter, nr);
}
@@ -3351,7 +3356,7 @@ static int io_iter_do_read(struct io_kiocb *req, struct iov_iter *iter)
if (req->file->f_op->read_iter)
return call_read_iter(req->file, &req->rw.kiocb, iter);
else if (req->file->f_op->read)
return loop_rw_iter(READ, req->file, &req->rw.kiocb, iter);
return loop_rw_iter(READ, req, iter);
else
return -EINVAL;
}
@@ -3542,7 +3547,7 @@ static int io_write(struct io_kiocb *req, bool force_nonblock,
if (req->file->f_op->write_iter)
ret2 = call_write_iter(req->file, kiocb, iter);
else if (req->file->f_op->write)
ret2 = loop_rw_iter(WRITE, req->file, kiocb, iter);
ret2 = loop_rw_iter(WRITE, req, iter);
else
ret2 = -EINVAL;
@@ -4931,32 +4936,25 @@ static void io_poll_complete(struct io_kiocb *req, __poll_t mask, int error)
io_commit_cqring(ctx);
}
static void io_poll_task_handler(struct io_kiocb *req, struct io_kiocb **nxt)
{
struct io_ring_ctx *ctx = req->ctx;
if (io_poll_rewait(req, &req->poll)) {
spin_unlock_irq(&ctx->completion_lock);
return;
}
hash_del(&req->hash_node);
io_poll_complete(req, req->result, 0);
spin_unlock_irq(&ctx->completion_lock);
*nxt = io_put_req_find_next(req);
io_cqring_ev_posted(ctx);
}
static void io_poll_task_func(struct callback_head *cb)
{
struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
struct io_ring_ctx *ctx = req->ctx;
struct io_kiocb *nxt = NULL;
struct io_kiocb *nxt;
io_poll_task_handler(req, &nxt);
if (io_poll_rewait(req, &req->poll)) {
spin_unlock_irq(&ctx->completion_lock);
} else {
hash_del(&req->hash_node);
io_poll_complete(req, req->result, 0);
spin_unlock_irq(&ctx->completion_lock);
nxt = io_put_req_find_next(req);
io_cqring_ev_posted(ctx);
if (nxt)
__io_req_task_submit(nxt);
}
percpu_ref_put(&ctx->refs);
}
@@ -5110,6 +5108,7 @@ static __poll_t __io_arm_poll_handler(struct io_kiocb *req,
struct io_ring_ctx *ctx = req->ctx;
bool cancel = false;
INIT_HLIST_NODE(&req->hash_node);
io_init_poll_iocb(poll, mask, wake_func);
poll->file = req->file;
poll->wait.private = req;
@@ -5171,7 +5170,6 @@ static bool io_arm_poll_handler(struct io_kiocb *req)
req->flags |= REQ_F_POLLED;
req->apoll = apoll;
INIT_HLIST_NODE(&req->hash_node);
mask = 0;
if (def->pollin)
@@ -5353,8 +5351,6 @@ static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
return -EINVAL;
if (sqe->addr || sqe->ioprio || sqe->off || sqe->len || sqe->buf_index)
return -EINVAL;
if (!poll->file)
return -EBADF;
events = READ_ONCE(sqe->poll32_events);
#ifdef __BIG_ENDIAN
@@ -5372,7 +5368,6 @@ static int io_poll_add(struct io_kiocb *req)
struct io_poll_table ipt;
__poll_t mask;
INIT_HLIST_NODE(&req->hash_node);
ipt.pt._qproc = io_poll_queue_proc;
mask = __io_arm_poll_handler(req, &req->poll, &ipt, poll->events,
@@ -6122,10 +6117,9 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
if (!list_empty(&req->link_list)) {
prev = list_entry(req->link_list.prev, struct io_kiocb,
link_list);
if (refcount_inc_not_zero(&prev->refs)) {
if (refcount_inc_not_zero(&prev->refs))
list_del_init(&req->link_list);
prev->flags &= ~REQ_F_LINK_TIMEOUT;
} else
else
prev = NULL;
}
@@ -6182,6 +6176,7 @@ static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
if (!nxt || nxt->opcode != IORING_OP_LINK_TIMEOUT)
return NULL;
nxt->flags |= REQ_F_LTIMEOUT_ACTIVE;
req->flags |= REQ_F_LINK_TIMEOUT;
return nxt;
}
@@ -6196,7 +6191,8 @@ static void __io_queue_sqe(struct io_kiocb *req, struct io_comp_state *cs)
again:
linked_timeout = io_prep_linked_timeout(req);
if ((req->flags & REQ_F_WORK_INITIALIZED) && req->work.identity->creds &&
if ((req->flags & REQ_F_WORK_INITIALIZED) &&
(req->work.flags & IO_WQ_WORK_CREDS) &&
req->work.identity->creds != current_cred()) {
if (old_creds)
revert_creds(old_creds);
@@ -6204,7 +6200,6 @@ again:
old_creds = NULL; /* restored original creds */
else
old_creds = override_creds(req->work.identity->creds);
req->work.flags |= IO_WQ_WORK_CREDS;
}
ret = io_issue_sqe(req, true, cs);
@@ -6245,8 +6240,10 @@ punt:
if (nxt) {
req = nxt;
if (req->flags & REQ_F_FORCE_ASYNC)
if (req->flags & REQ_F_FORCE_ASYNC) {
linked_timeout = NULL;
goto punt;
}
goto again;
}
exit:
@@ -6509,12 +6506,12 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
if (id) {
struct io_identity *iod;
io_req_init_async(req);
iod = idr_find(&ctx->personality_idr, id);
if (unlikely(!iod))
return -EINVAL;
refcount_inc(&iod->count);
io_put_identity(current->io_uring, req);
__io_req_init_async(req);
get_cred(iod->creds);
req->work.identity = iod;
req->work.flags |= IO_WQ_WORK_CREDS;
@@ -8690,19 +8687,11 @@ static void io_uring_del_task_file(struct file *file)
fput(file);
}
static void __io_uring_attempt_task_drop(struct file *file)
{
struct file *old = xa_load(&current->io_uring->xa, (unsigned long)file);
if (old == file)
io_uring_del_task_file(file);
}
/*
* Drop task note for this file if we're the only ones that hold it after
* pending fput()
*/
static void io_uring_attempt_task_drop(struct file *file, bool exiting)
static void io_uring_attempt_task_drop(struct file *file)
{
if (!current->io_uring)
return;
@@ -8710,10 +8699,9 @@ static void io_uring_attempt_task_drop(struct file *file, bool exiting)
* fput() is pending, will be 2 if the only other ref is our potential
* task file note. If the task is exiting, drop regardless of count.
*/
if (!exiting && atomic_long_read(&file->f_count) != 2)
return;
__io_uring_attempt_task_drop(file);
if (fatal_signal_pending(current) || (current->flags & PF_EXITING) ||
atomic_long_read(&file->f_count) == 2)
io_uring_del_task_file(file);
}
void __io_uring_files_cancel(struct files_struct *files)
@@ -8771,16 +8759,7 @@ void __io_uring_task_cancel(void)
static int io_uring_flush(struct file *file, void *data)
{
struct io_ring_ctx *ctx = file->private_data;
/*
* If the task is going away, cancel work it may have pending
*/
if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
data = NULL;
io_uring_cancel_task_requests(ctx, data);
io_uring_attempt_task_drop(file, !data);
io_uring_attempt_task_drop(file);
return 0;
}

View File

@@ -1038,8 +1038,7 @@ static int isofs_statfs (struct dentry *dentry, struct kstatfs *buf)
buf->f_bavail = 0;
buf->f_files = ISOFS_SB(sb)->s_ninodes;
buf->f_ffree = 0;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_namelen = NAME_MAX;
return 0;
}

View File

@@ -383,8 +383,7 @@ static int minix_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_files = sbi->s_ninodes;
buf->f_ffree = minix_count_free_inodes(sb);
buf->f_namelen = sbi->s_namelen;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
return 0;
}

View File

@@ -1710,7 +1710,8 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
return ERR_PTR(error);
}
if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS))
if (unlikely(nd->flags & LOOKUP_NO_SYMLINKS) ||
unlikely(link->mnt->mnt_flags & MNT_NOSYMFOLLOW))
return ERR_PTR(-ELOOP);
if (!(nd->flags & LOOKUP_RCU)) {

View File

@@ -3171,6 +3171,8 @@ int path_mount(const char *dev_name, struct path *path,
mnt_flags &= ~(MNT_RELATIME | MNT_NOATIME);
if (flags & MS_RDONLY)
mnt_flags |= MNT_READONLY;
if (flags & MS_NOSYMFOLLOW)
mnt_flags |= MNT_NOSYMFOLLOW;
/* The default atime for remount is preservation */
if ((flags & MS_REMOUNT) &&
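A hedged userspace sketch of the new flag in action; MS_NOSYMFOLLOW is assumed to be 256, matching the proposed uapi value, and older libc headers may not define it yet. The MS_REMOUNT | MS_BIND combination is the standard way to change per-mount flags in place:

#include <sys/mount.h>
#include <stdio.h>

#ifndef MS_NOSYMFOLLOW
#define MS_NOSYMFOLLOW 256	/* assumed uapi value; absent from older headers */
#endif

int main(void)
{
	/* hypothetical path: remount /mnt/untrusted so that path lookup
	 * refuses to follow any symlink located on that mount (-ELOOP) */
	if (mount(NULL, "/mnt/untrusted", NULL,
		  MS_REMOUNT | MS_BIND | MS_NOSYMFOLLOW, NULL) != 0)
		perror("mount");
	return 0;
}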

View File

@@ -651,8 +651,7 @@ static int nilfs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_files = nmaxinodes;
buf->f_ffree = nfreeinodes;
buf->f_namelen = NILFS_NAME_LEN;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
return 0;
}

View File

@@ -2643,8 +2643,7 @@ static int ntfs_statfs(struct dentry *dentry, struct kstatfs *sfs)
* the least significant 32-bits in f_fsid[0] and the most significant
* 32-bits in f_fsid[1].
*/
sfs->f_fsid.val[0] = vol->serial_no & 0xffffffff;
sfs->f_fsid.val[1] = (vol->serial_no >> 32) & 0xffffffff;
sfs->f_fsid = u64_to_fsid(vol->serial_no);
/* Maximum length of filenames. */
sfs->f_namelen = NTFS_MAX_NAME_LEN;
return 0;

View File

@@ -282,8 +282,7 @@ static int omfs_statfs(struct dentry *dentry, struct kstatfs *buf)
buf->f_blocks = sbi->s_num_blocks;
buf->f_files = sbi->s_num_blocks;
buf->f_namelen = OMFS_NAMELEN;
buf->f_fsid.val[0] = (u32)id;
buf->f_fsid.val[1] = (u32)(id >> 32);
buf->f_fsid = u64_to_fsid(id);
buf->f_bfree = buf->f_bavail = buf->f_ffree =
omfs_count_free(s);
@@ -363,12 +362,11 @@ static int omfs_get_imap(struct super_block *sb)
bh = sb_bread(sb, block++);
if (!bh)
goto nomem_free;
*ptr = kmalloc(sb->s_blocksize, GFP_KERNEL);
*ptr = kmemdup(bh->b_data, sb->s_blocksize, GFP_KERNEL);
if (!*ptr) {
brelse(bh);
goto nomem_free;
}
memcpy(*ptr, bh->b_data, sb->s_blocksize);
if (count < sb->s_blocksize)
memset((void *)*ptr + count, 0xff,
sb->s_blocksize - count);

Some files were not shown because too many files have changed in this diff