Commit Graph

60722 Commits

Author SHA1 Message Date
Linus Torvalds
922c010cf2 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "22 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
  fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links
  fs: fs_parser: fix printk format warning
  checkpatch: add %pt as a valid vsprintf extension
  mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate
  drivers/block/zram/zram_drv.c: fix idle/writeback string compare
  mm/page_isolation.c: fix a wrong flag in set_migratetype_isolate()
  mm/memory_hotplug.c: fix notification in offline error path
  ptrace: take into account saved_sigmask in PTRACE{GET,SET}SIGMASK
  fs/proc/kcore.c: make kcore_modules static
  include/linux/list.h: fix list_is_first() kernel-doc
  mm/debug.c: fix __dump_page when mapping->host is not set
  mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
  include/linux/hugetlb.h: convert to use vm_fault_t
  iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debugging
  mm: add support for kmem caches in DMA32 zone
  ocfs2: fix inode bh swapping mixup in ocfs2_reflink_inodes_lock
  mm/hotplug: fix offline undo_isolate_page_range()
  fs/open.c: allow opening only regular files during execve()
  mailmap: add Changbin Du
  mm/debug.c: add a cast to u64 for atomic64_read()
  ...
2019-03-29 16:02:28 -07:00
Linus Torvalds
ffb8e45cf3 Merge tag 'for-linus-20190329' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Small set of fixes that should go into this series. This contains:

   - compat signal mask fix for io_uring (Arnd)

   - EAGAIN corner case for direct vs buffered writes for io_uring
     (Roman)

   - NVMe pull request from Christoph with various little fixes

   - sbitmap ws_active fix, which caused a perf regression for shared
     tags (me)

   - sbitmap bit ordering fix (Ming)

   - libata on-stack DMA fix (Raymond)"

* tag 'for-linus-20190329' of git://git.kernel.dk/linux-block:
  nvmet: fix error flow during ns enable
  nvmet: fix building bvec from sg list
  nvme-multipath: relax ANA state check
  nvme-tcp: fix an endianess miss-annotation
  libata: fix using DMA buffers on stack
  io_uring: offload write to async worker in case of -EAGAIN
  sbitmap: order READ/WRITE freed instance and setting clear bit
  blk-mq: fix sbitmap ws_active for shared tags
  io_uring: fix big-endian compat signal mask handling
  blk-mq: update comment for blk_mq_hctx_has_pending()
  blk-mq: use blk_mq_put_driver_tag() to put tag
2019-03-29 14:43:07 -07:00
Linus Torvalds
7376e39ad9 Merge tag 'ceph-for-5.1-rc3' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "A patch to avoid choking on multipage bvecs in the messenger and a
  small use-after-free fix"

* tag 'ceph-for-5.1-rc3' of git://github.com/ceph/ceph-client:
  ceph: fix use-after-free on symlink traversal
  libceph: fix breakage caused by multipage bvecs
2019-03-29 14:41:09 -07:00
Linus Torvalds
c6503f12d1 Merge tag 'xfs-5.1-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
 "Here are a few fixes for some corruption bugs and uninitialized
  variable problems. The few patches here have gone through a few days
  worth of fstest runs with no new problems observed.

  Changes since last update:

   - Fix a bunch of static checker complaints about uninitialized
     variables and insufficient range checks.

   - Avoid a crash when incore extent map data are corrupt.

   - Disallow FITRIM when we haven't recovered the log and know the
     metadata are stale.

   - Fix a data corruption when doing unaligned overlapping dio writes"

* tag 'xfs-5.1-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: serialize unaligned dio writes against all other dio writes
  xfs: prohibit fstrim in norecovery mode
  xfs: always init bma in xfs_bmapi_write
  xfs: fix btree scrub checking with regards to root-in-inode
  xfs: dabtree scrub needs to range-check level
  xfs: don't trip over uninitialized buffer on extent read of corrupted inode
2019-03-29 14:36:57 -07:00
YueHaibing
23da958803 fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links
Syzkaller reports:

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 5373 Comm: syz-executor.0 Not tainted 5.0.0-rc8+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:put_links+0x101/0x440 fs/proc/proc_sysctl.c:1599
Code: 00 0f 85 3a 03 00 00 48 8b 43 38 48 89 44 24 20 48 83 c0 38 48 89 c2 48 89 44 24 28 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 fe 02 00 00 48 8b 74 24 20 48 c7 c7 60 2a 9d 91
RSP: 0018:ffff8881d828f238 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff8881e01b1140 RCX: ffffffff8ee98267
RDX: 0000000000000007 RSI: ffffc90001479000 RDI: ffff8881e01b1178
RBP: dffffc0000000000 R08: ffffed103ee27259 R09: ffffed103ee27259
R10: 0000000000000001 R11: ffffed103ee27258 R12: fffffffffffffff4
R13: 0000000000000006 R14: ffff8881f59838c0 R15: dffffc0000000000
FS:  00007f072254f700(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff8b286668 CR3: 00000001f0542002 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 drop_sysctl_table+0x152/0x9f0 fs/proc/proc_sysctl.c:1629
 get_subdir fs/proc/proc_sysctl.c:1022 [inline]
 __register_sysctl_table+0xd65/0x1090 fs/proc/proc_sysctl.c:1335
 br_netfilter_init+0xbc/0x1000 [br_netfilter]
 do_one_initcall+0xfa/0x5ca init/main.c:887
 do_init_module+0x204/0x5f6 kernel/module.c:3460
 load_module+0x66b2/0x8570 kernel/module.c:3808
 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x462e99
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f072254ec58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000003
RBP: 00007f072254ec70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f072254f6bc
R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004
Modules linked in: br_netfilter(+) dvb_usb_dibusb_mc_common dib3000mc dibx000_common dvb_usb_dibusb_common dvb_usb_dw2102 dvb_usb classmate_laptop palmas_regulator cn videobuf2_v4l2 v4l2_common snd_soc_bd28623 mptbase snd_usb_usx2y snd_usbmidi_lib snd_rawmidi wmi libnvdimm lockd sunrpc grace rc_kworld_pc150u rc_core rtc_da9063 sha1_ssse3 i2c_cros_ec_tunnel adxl34x_spi adxl34x nfnetlink lib80211 i5500_temp dvb_as102 dvb_core videobuf2_common videodev media videobuf2_vmalloc videobuf2_memops udc_core lnbp22 leds_lp3952 hid_roccat_ryos s1d13xxxfb mtd vport_geneve openvswitch nf_conncount nf_nat_ipv6 nsh geneve udp_tunnel ip6_udp_tunnel snd_soc_mt6351 sis_agp phylink snd_soc_adau1761_spi snd_soc_adau1761 snd_soc_adau17x1 snd_soc_core snd_pcm_dmaengine ac97_bus snd_compress snd_soc_adau_utils snd_soc_sigmadsp_regmap snd_soc_sigmadsp raid_class hid_roccat_konepure hid_roccat_common hid_roccat c2port_duramar2150 core mdio_bcm_unimac iptable_security iptable_raw iptable_mangle
 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel hsr veth netdevsim devlink vxcan batman_adv cfg80211 rfkill chnl_net caif nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel joydev mousedev ide_pci_generic piix aesni_intel aes_x86_64 ide_core crypto_simd atkbd cryptd glue_helper serio_raw ata_generic pata_acpi i2c_piix4 floppy sch_fq_codel ip_tables x_tables ipv6 [last unloaded: lm73]
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace 770020de38961fd0 ]---

A new dir entry can be created in get_subdir and its 'header->parent' is
set to NULL.  Only after insert_header success, it will be set to 'dir',
otherwise 'header->parent' is set to NULL and drop_sysctl_table is called.
However in err handling path of get_subdir, drop_sysctl_table also be
called on 'new->header' regardless its value of parent pointer.  Then
put_links is called, which triggers NULL-ptr deref when access member of
header->parent.

In fact we have multiple error paths which call drop_sysctl_table() there,
upon failure on insert_links() we also call drop_sysctl_table().And even
in the successful case on __register_sysctl_table() we still always call
drop_sysctl_table().This patch fix it.

Link: http://lkml.kernel.org/r/20190314085527.13244-1-yuehaibing@huawei.com
Fixes: 0e47c99d7f ("sysctl: Replace root_list with links between sysctl_table_sets")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>    [3.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 10:01:38 -07:00
Randy Dunlap
2620327852 fs: fs_parser: fix printk format warning
Fix printk format warning (seen on i386 builds) by using ptrdiff format
specifier (%t):

  fs/fs_parser.c:413:6: warning: format `%lu' expects argument of type `long unsigned int', but argument 3 has type `int' [-Wformat=]

Link: http://lkml.kernel.org/r/19432668-ffd3-fbb2-af4f-1c8e48f6cc81@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 10:01:38 -07:00
YueHaibing
eebf364806 fs/proc/kcore.c: make kcore_modules static
Fix sparse warning:

  fs/proc/kcore.c:591:19: warning:
   symbol 'kcore_modules' was not declared. Should it be static?

Link: http://lkml.kernel.org/r/20190320135417.13272-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: James Morse <james.morse@arm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 10:01:37 -07:00
Darrick J. Wong
e6a9467ea1 ocfs2: fix inode bh swapping mixup in ocfs2_reflink_inodes_lock
ocfs2_reflink_inodes_lock() can swap the inode1/inode2 variables so that
we always grab cluster locks in order of increasing inode number.

Unfortunately, we forget to swap the inode record buffer head pointers
when we've done this, which leads to incorrect bookkeepping when we're
trying to make the two inodes have the same refcount tree.

This has the effect of causing filesystem shutdowns if you're trying to
reflink data from inode 100 into inode 97, where inode 100 already has a
refcount tree attached and inode 97 doesn't.  The reflink code decides
to copy the refcount tree pointer from 100 to 97, but uses inode 97's
inode record to open the tree root (which it doesn't have) and blows up.
This issue causes filesystem shutdowns and metadata corruption!

Link: http://lkml.kernel.org/r/20190312214910.GK20533@magnolia
Fixes: 29ac8e856c ("ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 10:01:37 -07:00
Tetsuo Handa
73601ea5b7 fs/open.c: allow opening only regular files during execve()
syzbot is hitting lockdep warning [1] due to trying to open a fifo
during an execve() operation.  But we don't need to open non regular
files during an execve() operation, for all files which we will need are
the executable file itself and the interpreter programs like /bin/sh and
ld-linux.so.2 .

Since the manpage for execve(2) says that execve() returns EACCES when
the file or a script interpreter is not a regular file, and the manpage
for uselib(2) says that uselib() can return EACCES, and we use
FMODE_EXEC when opening for execve()/uselib(), we can bail out if a non
regular file is requested with FMODE_EXEC set.

Since this deadlock followed by khungtaskd warnings is trivially
reproducible by a local unprivileged user, and syzbot's frequent crash
due to this deadlock defers finding other bugs, let's workaround this
deadlock until we get a chance to find a better solution.

[1] https://syzkaller.appspot.com/bug?id=b5095bfec44ec84213bac54742a82483aad578ce

Link: http://lkml.kernel.org/r/1552044017-7890-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Reported-by: syzbot <syzbot+e93a80c1bb7c5c56e522461c149f8bf55eab1b2b@syzkaller.appspotmail.com>
Fixes: 8924feff66 ("splice: lift pipe_lock out of splice_to_pipe()")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>	[4.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-29 10:01:37 -07:00
Filipe Manana
f35f06c355 Btrfs: do not allow trimming when a fs is mounted with the nologreplay option
Whan a filesystem is mounted with the nologreplay mount option, which
requires it to be mounted in RO mode as well, we can not allow discard on
free space inside block groups, because log trees refer to extents that
are not pinned in a block group's free space cache (pinning the extents is
precisely the first phase of replaying a log tree).

So do not allow the fitrim ioctl to do anything when the filesystem is
mounted with the nologreplay option, because later it can be mounted RW
without that option, which causes log replay to happen and result in
either a failure to replay the log trees (leading to a mount failure), a
crash or some silent corruption.

Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Fixes: 96da09192c ("btrfs: Introduce new mount option to disable tree log replay")
CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-03-28 18:10:27 +01:00
David Howells
8c7ae38d1c afs: Fix StoreData op marshalling
The marshalling of AFS.StoreData, AFS.StoreData64 and YFS.StoreData64 calls
generated by ->setattr() ops for the purpose of expanding a file is
incorrect due to older documentation incorrectly describing the way the RPC
'FileLength' parameter is meant to work.

The older documentation says that this is the length the file is meant to
end up at the end of the operation; however, it was never implemented this
way in any of the servers, but rather the file is truncated down to this
before the write operation is effected, and never expanded to it (and,
indeed, it was renamed to 'TruncPos' in 2014).

Fix this by setting the position parameter to the new file length and doing
a zero-lengh write there.

The bug causes Xwayland to SIGBUS due to unexpected non-expansion of a file
it then mmaps.  This can be tested by giving the following test program a
filename in an AFS directory:

	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <fcntl.h>
	#include <sys/mman.h>
	int main(int argc, char *argv[])
	{
		char *p;
		int fd;
		if (argc != 2) {
			fprintf(stderr,
				"Format: test-trunc-mmap <file>\n");
			exit(2);
		}
		fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC);
		if (fd < 0) {
			perror(argv[1]);
			exit(1);
		}
		if (ftruncate(fd, 0x140008) == -1) {
			perror("ftruncate");
			exit(1);
		}
		p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			exit(1);
		}
		p[0] = 'a';
		if (munmap(p, 4096) < 0) {
			perror("munmap");
			exit(1);
		}
		if (close(fd) < 0) {
			perror("close");
			exit(1);
		}
		exit(0);
	}

Fixes: 31143d5d51 ("AFS: implement basic file write support")
Reported-by: Jonathan Billings <jsbillin@umich.edu>
Tested-by: Jonathan Billings <jsbillin@umich.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-28 08:54:20 -07:00
Al Viro
daf5cc27ee ceph: fix use-after-free on symlink traversal
free the symlink body after the same RCU delay we have for freeing the
struct inode itself, so that traversal during RCU pathwalk wouldn't step
into freed memory.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-03-27 19:00:37 +01:00
Linus Torvalds
14c741de93 Merge tag 'nfs-for-5.1-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

  Stable fixes:
   - Fix nfs4_lock_state refcounting in nfs4_alloc_{lock,unlock}data()
   - fix mount/umount race in nlmclnt.
   - NFSv4.1 don't free interrupted slot on open

  Bugfixes:
   - Don't let RPC_SOFTCONN tasks time out if the transport is connected
   - Fix a typo in nfs_init_timeout_values()
   - Fix layoutstats handling during read failovers
   - fix uninitialized variable warning"

* tag 'nfs-for-5.1-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: fix uninitialized variable warning
  pNFS/flexfiles: Fix layoutstats handling during read failovers
  NFS: Fix a typo in nfs_init_timeout_values()
  SUNRPC: Don't let RPC_SOFTCONN tasks time out if the transport is connected
  NFSv4.1 don't free interrupted slot on open
  NFS: fix mount/umount race in nlmclnt.
  NFS: Fix nfs4_lock_state refcounting in nfs4_alloc_{lock,unlock}data()
2019-03-26 14:25:48 -07:00
Linus Torvalds
65ae689329 Merge tag 'for-5.1-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - fsync fixes: i_size for truncate vs fsync, dio vs buffered during
   snapshotting, remove complicated but incomplete assertion

 - removed excessive warnigs, misreported device stats updates

 - fix raid56 page mapping for 32bit arch

 - fixes reported by static analyzer

* tag 'for-5.1-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix assertion failure on fsync with NO_HOLES enabled
  btrfs: Avoid possible qgroup_rsv_size overflow in btrfs_calculate_inode_block_rsv_size
  btrfs: Fix bound checking in qgroup_trace_new_subtree_blocks
  btrfs: raid56: properly unmap parity page in finish_parity_scrub()
  btrfs: don't report readahead errors and don't update statistics
  Btrfs: fix file corruption after snapshotting due to mix of buffered/DIO writes
  btrfs: remove WARN_ON in log_dir_items
  Btrfs: fix incorrect file size after shrinking truncate and fsync
2019-03-26 10:32:13 -07:00
Brian Foster
2032a8a27b xfs: serialize unaligned dio writes against all other dio writes
XFS applies more strict serialization constraints to unaligned
direct writes to accommodate things like direct I/O layer zeroing,
unwritten extent conversion, etc. Unaligned submissions acquire the
exclusive iolock and wait for in-flight dio to complete to ensure
multiple submissions do not race on the same block and cause data
corruption.

This generally works in the case of an aligned dio followed by an
unaligned dio, but the serialization is lost if I/Os occur in the
opposite order. If an unaligned write is submitted first and
immediately followed by an overlapping, aligned write, the latter
submits without the typical unaligned serialization barriers because
there is no indication of an unaligned dio still in-flight. This can
lead to unpredictable results.

To provide proper unaligned dio serialization, require that such
direct writes are always the only dio allowed in-flight at one time
for a particular inode. We already acquire the exclusive iolock and
drain pending dio before submitting the unaligned dio. Wait once
more after the dio submission to hold the iolock across the I/O and
prevent further submissions until the unaligned I/O completes. This
is heavy handed, but consistent with the current pre-submission
serialization for unaligned direct writes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-03-26 08:37:55 -07:00
Sascha Hauer
27942ef503 quota: remove trailing whitespaces
This removes all trailing whitespaces in fs/quota/.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-03-26 11:21:37 +01:00
Chengguang Xu
df15a2a59d quota: code cleanup for __dquot_alloc_space()
Replace (flags & DQUOT_SPACE_RESERVE) with
variable reserve.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-03-26 11:21:23 +01:00
Shuning Zhang
1206d028b2 ext2: Adjust the comment of function ext2_alloc_branch
The name of argument is error in the header comment.
@num should be @indirect_blks.
At the same time, there was a lack of description of the two parameters
@blks and @goal.
This commit therefore fixes this header comment.

Signed-off-by: Shuning Zhang <sunny.s.zhang@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2019-03-26 11:21:23 +01:00
Jan Kara
a768a9abc6 udf: Explain handling of load_nls() failure
Add comment explaining that load_nls() failure gets handled back in
udf_fill_super() to avoid false impression that it is unhandled.

Signed-off-by: Jan Kara <jack@suse.cz>
2019-03-26 11:21:23 +01:00
Roman Penyaev
9bf7933fc3 io_uring: offload write to async worker in case of -EAGAIN
In case of direct write -EAGAIN will be returned if page cache was
previously populated.  To avoid immediate completion of a request
with -EAGAIN error write has to be offloaded to the async worker,
like io_read() does.

Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-25 13:13:21 -06:00
Arnd Bergmann
9e75ad5d8f io_uring: fix big-endian compat signal mask handling
On big-endian architectures, the signal masks are differnet
between 32-bit and 64-bit tasks, so we have to use a different
function for reading them from user space.

io_cqring_wait() initially got this wrong, and always interprets
this as a native structure. This is ok on x86 and most arm64,
but not on s390, ppc64be, mips64be, sparc64 and parisc.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-25 10:06:03 -06:00
Darrick J. Wong
ed79dac98c xfs: prohibit fstrim in norecovery mode
The xfs fstrim implementation uses the free space btrees to find free
space that can be discarded.  If we haven't recovered the log, the bnobt
will be stale and we absolutely *cannot* use stale metadata to zap the
underlying storage.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2019-03-25 08:03:29 -07:00
Jeff Layton
945ab8f6de locks: wake any locks blocked on request before deadlock check
Andreas reported that he was seeing the tdbtorture test fail in some
cases with -EDEADLCK when it wasn't before. Some debugging showed that
deadlock detection was sometimes discovering the caller's lock request
itself in a dependency chain.

While we remove the request from the blocked_lock_hash prior to
reattempting to acquire it, any locks that are blocked on that request
will still be present in the hash and will still have their fl_blocker
pointer set to the current request.

This causes posix_locks_deadlock to find a deadlock dependency chain
when it shouldn't, as a lock request cannot block itself.

We are going to end up waking all of those blocked locks anyway when we
go to reinsert the request back into the blocked_lock_hash, so just do
it prior to checking for deadlocks. This ensures that any lock blocked
on the current request will no longer be part of any blocked request
chain.

URL: https://bugzilla.kernel.org/show_bug.cgi?id=202975
Fixes: 5946c4319e ("fs/locks: allow a lock request to block other requests.")
Cc: stable@vger.kernel.org
Reported-by: Andreas Schneider <asn@redhat.com>
Signed-off-by: Neil Brown <neilb@suse.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2019-03-25 08:36:24 -04:00
Linus Torvalds
17403fa277 Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
 "Miscellaneous ext4 bug fixes for 5.1"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: prohibit fstrim in norecovery mode
  ext4: cleanup bh release code in ext4_ind_remove_space()
  ext4: brelse all indirect buffer in ext4_ind_remove_space()
  ext4: report real fs size after failed resize
  ext4: add missing brelse() in add_new_gdb_meta_bg()
  ext4: remove useless ext4_pin_inode()
  ext4: avoid panic during forced reboot
  ext4: fix data corruption caused by unaligned direct AIO
  ext4: fix NULL pointer dereference while journal is aborted
2019-03-24 13:41:37 -07:00
Linus Torvalds
19caf581ba Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of x86 fixes:

   - Prevent potential NULL pointer dereferences in the HPET and HyperV
     code

   - Exclude the GART aperture from /proc/kcore to prevent kernel
     crashes on access

   - Use the correct macros for Cyrix I/O on Geode processors

   - Remove yet another kernel address printk leak

   - Announce microcode reload completion as requested by quite some
     people. Microcode loading has become popular recently.

   - Some 'Make Clang' happy fixlets

   - A few cleanups for recently added code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/gart: Exclude GART aperture from kcore
  x86/hw_breakpoints: Make default case in hw_breakpoint_arch_parse() return an error
  x86/mm/pti: Make local symbols static
  x86/cpu/cyrix: Remove {get,set}Cx86_old macros used for Cyrix processors
  x86/cpu/cyrix: Use correct macros for Cyrix calls on Geode processors
  x86/microcode: Announce reload operation's completion
  x86/hyperv: Prevent potential NULL pointer dereference
  x86/hpet: Prevent potential NULL pointer dereference
  x86/lib: Fix indentation issue, remove extra tab
  x86/boot: Restrict header scope to make Clang happy
  x86/mm: Don't leak kernel addresses
  x86/cpufeature: Fix various quality problems in the <asm/cpu_device_hd.h> header
2019-03-24 11:12:27 -07:00
Linus Torvalds
38104c0020 Merge tag '5.1-rc1-cifs-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb3 fixes from Steve French:

 - two fixes for stable for guest mount problems with smb3.1.1

 - two fixes for crediting (SMB3 flow control) on resent requests

 - a byte range lock leak fix

 - two fixes for incorrect rc mappings

* tag '5.1-rc1-cifs-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module version number
  SMB3: Fix SMB3.1.1 guest mounts to Samba
  cifs: Fix slab-out-of-bounds when tracing SMB tcon
  cifs: allow guest mounts to work for smb3.11
  fix incorrect error code mapping for OBJECTID_NOT_FOUND
  cifs: fix that return -EINVAL when do dedupe operation
  CIFS: Fix an issue with re-sending rdata when transport returning -EAGAIN
  CIFS: Fix an issue with re-sending wdata when transport returning -EAGAIN
2019-03-24 09:58:08 -07:00
Linus Torvalds
1bdd3dbfff Merge tag 'io_uring-20190323' of git://git.kernel.dk/linux-block
Pull io_uring fixes and improvements from Jens Axboe:
 "The first five in this series are heavily inspired by the work Al did
  on the aio side to fix the races there.

  The last two re-introduce a feature that was in io_uring before it got
  merged, but which I pulled since we didn't have a good way to have
  BVEC iters that already have a stable reference. These aren't
  necessarily related to block, it's just how io_uring pins fixed
  buffers"

* tag 'io_uring-20190323' of git://git.kernel.dk/linux-block:
  block: add BIO_NO_PAGE_REF flag
  iov_iter: add ITER_BVEC_FLAG_NO_REF flag
  io_uring: mark me as the maintainer
  io_uring: retry bulk slab allocs as single allocs
  io_uring: fix poll races
  io_uring: fix fget/fput handling
  io_uring: add prepped flag
  io_uring: make io_read/write return an integer
  io_uring: use regular request ref counts
2019-03-23 10:25:12 -07:00
Darrick J. Wong
18915b5873 ext4: prohibit fstrim in norecovery mode
The ext4 fstrim implementation uses the block bitmaps to find free space
that can be discarded.  If we haven't replayed the journal, the bitmaps
will be stale and we absolutely *cannot* use stale metadata to zap the
underlying storage.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2019-03-23 12:10:29 -04:00
Trond Myklebust
166bd5b889 pNFS/flexfiles: Fix layoutstats handling during read failovers
During a read failover, we may end up changing the value of
the pgio_mirror_idx, so make sure that we record the layout
stats before that update.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-03-23 12:03:58 -04:00
Trond Myklebust
5a69824393 NFS: Fix a typo in nfs_init_timeout_values()
Specifying a retrans=0 mount parameter to a NFS/TCP mount, is
inadvertently causing the NFS client to rewrite any specified
timeout parameter to the default of 60 seconds.

Fixes: a956beda19 ("NFS: Allow the mount option retrans=0")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-03-23 12:03:58 -04:00
zhangyi (F)
5e86bdda41 ext4: cleanup bh release code in ext4_ind_remove_space()
Currently, we are releasing the indirect buffer where we are done with
it in ext4_ind_remove_space(), so we can see the brelse() and
BUFFER_TRACE() everywhere.  It seems fragile and hard to read, and we
may probably forget to release the buffer some day.  This patch cleans
up the code by putting of the code which releases the buffers to the
end of the function.

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2019-03-23 11:56:01 -04:00
zhangyi (F)
674a2b2723 ext4: brelse all indirect buffer in ext4_ind_remove_space()
All indirect buffers get by ext4_find_shared() should be released no
mater the branch should be freed or not. But now, we forget to release
the lower depth indirect buffers when removing space from the same
higher depth indirect block. It will lead to buffer leak and futher
more, it may lead to quota information corruption when using old quota,
consider the following case.

 - Create and mount an empty ext4 filesystem without extent and quota
   features,
 - quotacheck and enable the user & group quota,
 - Create some files and write some data to them, and then punch hole
   to some files of them, it may trigger the buffer leak problem
   mentioned above.
 - Disable quota and run quotacheck again, it will create two new
   aquota files and write the checked quota information to them, which
   probably may reuse the freed indirect block(the buffer and page
   cache was not freed) as data block.
 - Enable quota again, it will invoke
   vfs_load_quota_inode()->invalidate_bdev() to try to clean unused
   buffers and pagecache. Unfortunately, because of the buffer of quota
   data block is still referenced, quota code cannot read the up to date
   quota info from the device and lead to quota information corruption.

This problem can be reproduced by xfstests generic/231 on ext3 file
system or ext4 file system without extent and quota features.

This patch fix this problem by releasing the missing indirect buffers,
in ext4_ind_remove_space().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
2019-03-23 11:43:05 -04:00
Kairui Song
ffc8599aa9 x86/gart: Exclude GART aperture from kcore
On machines where the GART aperture is mapped over physical RAM,
/proc/kcore contains the GART aperture range. Accessing the GART range via
/proc/kcore results in a kernel crash.

vmcore used to have the same issue, until it was fixed with commit
2a3e83c6f9 ("x86/gart: Exclude GART aperture from vmcore")', leveraging
existing hook infrastructure in vmcore to let /proc/vmcore return zeroes
when attempting to read the aperture region, and so it won't read from the
actual memory.

Apply the same workaround for kcore. First implement the same hook
infrastructure for kcore, then reuse the hook functions introduced in the
previous vmcore fix. Just with some minor adjustment, rename some functions
for more general usage, and simplify the hook infrastructure a bit as there
is no module usage yet.

Suggested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Dave Young <dyoung@redhat.com>
Link: https://lkml.kernel.org/r/20190308030508.13548-1-kasong@redhat.com
2019-03-23 12:11:49 +01:00
Steve French
cf7d624f8d cifs: update internal module version number
To 2.19

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-22 22:43:04 -05:00
Steve French
8c11a607d1 SMB3: Fix SMB3.1.1 guest mounts to Samba
Workaround problem with Samba responses to SMB3.1.1
null user (guest) mounts.  The server doesn't set the
expected flag in the session setup response so we have
to do a similar check to what is done in smb3_validate_negotiate
where we also check if the user is a null user (but not sec=krb5
since username might not be passed in on mount for Kerberos case).

Note that the commit below tightened the conditions and forced signing
for the SMB2-TreeConnect commands as per MS-SMB2.
However, this should only apply to normal user sessions and not for
cases where there is no user (even if server forgets to set the flag
in the response) since we don't have anything useful to sign with.
This is especially important now that the more secure SMB3.1.1 protocol
is in the default dialect list.

An earlier patch ("cifs: allow guest mounts to work for smb3.11") fixed
the guest mounts to Windows.

    Fixes: 6188f28bf6 ("Tree connect for SMB3.1.1 must be signed for non-encrypted shares")

Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara <palcantara@suse.de>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-22 22:42:49 -05:00
Paulo Alcantara (SUSE)
68ddb49680 cifs: Fix slab-out-of-bounds when tracing SMB tcon
This patch fixes the following KASAN report:

[  779.044746] BUG: KASAN: slab-out-of-bounds in string+0xab/0x180
[  779.044750] Read of size 1 at addr ffff88814f327968 by task trace-cmd/2812

[  779.044756] CPU: 1 PID: 2812 Comm: trace-cmd Not tainted 5.1.0-rc1+ #62
[  779.044760] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-0-ga698c89-prebuilt.qemu.org 04/01/2014
[  779.044761] Call Trace:
[  779.044769]  dump_stack+0x5b/0x90
[  779.044775]  ? string+0xab/0x180
[  779.044781]  print_address_description+0x6c/0x23c
[  779.044787]  ? string+0xab/0x180
[  779.044792]  ? string+0xab/0x180
[  779.044797]  kasan_report.cold.3+0x1a/0x32
[  779.044803]  ? string+0xab/0x180
[  779.044809]  string+0xab/0x180
[  779.044816]  ? widen_string+0x160/0x160
[  779.044822]  ? vsnprintf+0x5bf/0x7f0
[  779.044829]  vsnprintf+0x4e7/0x7f0
[  779.044836]  ? pointer+0x4a0/0x4a0
[  779.044841]  ? seq_buf_vprintf+0x79/0xc0
[  779.044848]  seq_buf_vprintf+0x62/0xc0
[  779.044855]  trace_seq_printf+0x113/0x210
[  779.044861]  ? trace_seq_puts+0x110/0x110
[  779.044867]  ? trace_raw_output_prep+0xd8/0x110
[  779.044876]  trace_raw_output_smb3_tcon_class+0x9f/0xc0
[  779.044882]  print_trace_line+0x377/0x890
[  779.044888]  ? tracing_buffers_read+0x300/0x300
[  779.044893]  ? ring_buffer_read+0x58/0x70
[  779.044899]  s_show+0x6e/0x140
[  779.044906]  seq_read+0x505/0x6a0
[  779.044913]  vfs_read+0xaf/0x1b0
[  779.044919]  ksys_read+0xa1/0x130
[  779.044925]  ? kernel_write+0xa0/0xa0
[  779.044931]  ? __do_page_fault+0x3d5/0x620
[  779.044938]  do_syscall_64+0x63/0x150
[  779.044944]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  779.044949] RIP: 0033:0x7f62c2c2db31
[ 779.044955] Code: fe ff ff 48 8d 3d 17 9e 09 00 48 83 ec 08 e8 96 02
02 00 66 0f 1f 44 00 00 8b 05 fa fc 2c 00 48 63 ff 85 c0 75 13 31 c0
0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 55 53 48 89 d5 48
89
[  779.044958] RSP: 002b:00007ffd6e116678 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  779.044964] RAX: ffffffffffffffda RBX: 0000560a38be9260 RCX: 00007f62c2c2db31
[  779.044966] RDX: 0000000000002000 RSI: 00007ffd6e116710 RDI: 0000000000000003
[  779.044966] RDX: 0000000000002000 RSI: 00007ffd6e116710 RDI: 0000000000000003
[  779.044969] RBP: 00007f62c2ef5420 R08: 0000000000000000 R09: 0000000000000003
[  779.044972] R10: ffffffffffffffa8 R11: 0000000000000246 R12: 00007ffd6e116710
[  779.044975] R13: 0000000000002000 R14: 0000000000000d68 R15: 0000000000002000

[  779.044981] Allocated by task 1257:
[  779.044987]  __kasan_kmalloc.constprop.5+0xc1/0xd0
[  779.044992]  kmem_cache_alloc+0xad/0x1a0
[  779.044997]  getname_flags+0x6c/0x2a0
[  779.045003]  user_path_at_empty+0x1d/0x40
[  779.045008]  do_faccessat+0x12a/0x330
[  779.045012]  do_syscall_64+0x63/0x150
[  779.045017]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  779.045019] Freed by task 1257:
[  779.045023]  __kasan_slab_free+0x12e/0x180
[  779.045029]  kmem_cache_free+0x85/0x1b0
[  779.045034]  filename_lookup.part.70+0x176/0x250
[  779.045039]  do_faccessat+0x12a/0x330
[  779.045043]  do_syscall_64+0x63/0x150
[  779.045048]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[  779.045052] The buggy address belongs to the object at ffff88814f326600
which belongs to the cache names_cache of size 4096
[  779.045057] The buggy address is located 872 bytes to the right of
4096-byte region [ffff88814f326600, ffff88814f327600)
[  779.045058] The buggy address belongs to the page:
[  779.045062] page:ffffea00053cc800 count:1 mapcount:0 mapping:ffff88815b191b40 index:0x0 compound_mapcount: 0
[  779.045067] flags: 0x200000000010200(slab|head)
[  779.045075] raw: 0200000000010200 dead000000000100 dead000000000200 ffff88815b191b40
[  779.045081] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[  779.045083] page dumped because: kasan: bad access detected

[  779.045085] Memory state around the buggy address:
[  779.045089]  ffff88814f327800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  779.045093]  ffff88814f327880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  779.045097] >ffff88814f327900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  779.045099]                                                           ^
[  779.045103]  ffff88814f327980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  779.045107]  ffff88814f327a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  779.045109] ==================================================================
[  779.045110] Disabling lock debugging due to kernel taint

Correctly assign tree name str for smb3_tcon event.

Signed-off-by: Paulo Alcantara (SUSE) <paulo@paulo.ac>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-22 22:36:54 -05:00
Ronnie Sahlberg
e71ab2aa06 cifs: allow guest mounts to work for smb3.11
Fix Guest/Anonymous sessions so that they work with SMB 3.11.

The commit noted below tightened the conditions and forced signing for
the SMB2-TreeConnect commands as per MS-SMB2.
However, this should only apply to normal user sessions and not for
Guest/Anonumous sessions.

Fixes: 6188f28bf6 ("Tree connect for SMB3.1.1 must be signed for non-encrypted shares")

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-22 22:36:54 -05:00
Steve French
85f9987b23 fix incorrect error code mapping for OBJECTID_NOT_FOUND
It was mapped to EIO which can be confusing when user space
queries for an object GUID for an object for which the server
file system doesn't support (or hasn't saved one).

As Amir Goldstein suggested this is similar to ENOATTR
(equivalently ENODATA in Linux errno definitions) so
changing NT STATUS code mapping for OBJECTID_NOT_FOUND
to ENODATA.

Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Amir Goldstein <amir73il@gmail.com>
2019-03-22 22:36:54 -05:00
Xiaoli Feng
b073a08016 cifs: fix that return -EINVAL when do dedupe operation
dedupe_file_range operations is combiled into remap_file_range.
But it's always skipped for dedupe operations in function
cifs_remap_file_range.

Example to test:
Before this patch:
  # dd if=/dev/zero of=cifs/file bs=1M count=1
  # xfs_io -c "dedupe cifs/file 4k 64k 4k" cifs/file
  XFS_IOC_FILE_EXTENT_SAME: Invalid argument

After this patch:
  # dd if=/dev/zero of=cifs/file bs=1M count=1
  # xfs_io -c "dedupe cifs/file 4k 64k 4k" cifs/file
  XFS_IOC_FILE_EXTENT_SAME: Operation not supported

Influence for xfstests:
generic/091
generic/112
generic/127
generic/263
These tests report this error "do_copy_range:: Invalid
argument" instead of "FIDEDUPERANGE: Invalid argument".
Because there are still two bugs cause these test failed.
https://bugzilla.kernel.org/show_bug.cgi?id=202935
https://bugzilla.kernel.org/show_bug.cgi?id=202785

Signed-off-by: Xiaoli Feng <fengxiaoli0714@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-22 22:36:54 -05:00
Long Li
0b0dfd5921 CIFS: Fix an issue with re-sending rdata when transport returning -EAGAIN
When sending a rdata, transport may return -EAGAIN. In this case
we should re-obtain credits because the session may have been
reconnected.

Change in v2: adjust_credits before re-sending

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-22 22:36:54 -05:00
Long Li
d53e292f0f CIFS: Fix an issue with re-sending wdata when transport returning -EAGAIN
When sending a wdata, transport may return -EAGAIN. In this case
we should re-obtain credits because the session may have been
reconnected.

Change in v2: adjust_credits before re-sending

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-22 22:36:54 -05:00
Arnd Bergmann
a5fdd713d2 jfs: fix bogus variable self-initialization
A statement was originally added in 2006 to shut up a gcc warning,
now but now clang warns about it:

fs/jfs/jfs_txnmgr.c:1932:15: error: variable 'pxd' is uninitialized when used within its own initialization
      [-Werror,-Wuninitialized]
                pxd_t pxd = pxd;        /* truncated extent of xad */
                      ~~~   ^~~

Modern versions of gcc are fine without the silly assignment, so just
drop it. Tested with gcc-4.6 (released 2011), 4.7, 4.8, and 4.9.

Fixes: c9e3ad6021 ("JFS: Get rid of "may be used uninitialized" warnings")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2019-03-22 09:38:37 -05:00
Linus Torvalds
0939221e64 Merge tag 'fixes_for_v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull udf fixes from Jan Kara:
 "Two udf error handling fixes"

* tag 'fixes_for_v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Propagate errors from udf_truncate_extents()
  udf: Fix crash on IO error during truncate
2019-03-21 10:31:55 -07:00
Linus Torvalds
7294fbd441 Merge tag 'fsnotify_for_v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify fixes from Jan Kara:
 "One inotify and one fanotify fix"

* tag 'fsnotify_for_v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: Allow copying of file handle to userspace
  inotify: Fix fsnotify_mark refcount leak in inotify_update_existing_watch()
2019-03-21 10:24:00 -07:00
Ondrej Mosnacek
e19dfdc83b kernfs: initialize security of newly created nodes
Use the new security_kernfs_init_security() hook to allow LSMs to
possibly assign a non-default security context to a newly created kernfs
node based on the attributes of the new node and also its parent node.

This fixes an issue with cgroupfs under SELinux, where newly created
cgroup subdirectories/files would not inherit its parent's context if
it had been set explicitly to a non-default value (other than the genfs
context specified by the policy). This can be reproduced as follows (on
Fedora/RHEL):

    # mkdir /sys/fs/cgroup/unified/test
    # # Need permissive to change the label under Fedora policy:
    # setenforce 0
    # chcon -t container_file_t /sys/fs/cgroup/unified/test
    # ls -lZ /sys/fs/cgroup/unified
    total 0
    -r--r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.controllers
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.max.depth
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.max.descendants
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.procs
    -r--r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.stat
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.subtree_control
    -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.threads
    drwxr-xr-x.  2 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 init.scope
    drwxr-xr-x. 26 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:21 system.slice
    drwxr-xr-x.  3 root root system_u:object_r:container_file_t:s0 0 Jan 29 03:15 test
    drwxr-xr-x.  3 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 user.slice
    # mkdir /sys/fs/cgroup/unified/test/subdir

Actual result:

    # ls -ldZ /sys/fs/cgroup/unified/test/subdir
    drwxr-xr-x. 2 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir

Expected result:

    # ls -ldZ /sys/fs/cgroup/unified/test/subdir
    drwxr-xr-x. 2 root root unconfined_u:object_r:container_file_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir

Link: https://github.com/SELinuxProject/selinux-kernel/issues/39

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-03-20 22:10:15 -04:00
Ondrej Mosnacek
b230d5aba2 LSM: add new hook for kernfs node initialization
This patch introduces a new security hook that is intended for
initializing the security data for newly created kernfs nodes, which
provide a way of storing a non-default security context, but need to
operate independently from mounts (and therefore may not have an
associated inode at the moment of creation).

The main motivation is to allow kernfs nodes to inherit the context of
the parent under SELinux, similar to the behavior of
security_inode_init_security(). Other LSMs may implement their own logic
for handling the creation of new nodes.

This patch also adds helper functions to <linux/kernfs.h> for
getting/setting security xattrs of a kernfs node so that LSMs hooks are
able to do their job. Other important attributes should be accessible
direcly in the kernfs_node fields (in case there is need for more, then
new helpers should be added to kernfs.h along with the patch that needs
them).

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: more manual merge fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-03-20 22:01:02 -04:00
Ondrej Mosnacek
0ac6075a32 kernfs: use simple_xattrs for security attributes
Replace the special handling of security xattrs with simple_xattrs, as
is already done for the trusted xattrs. This simplifies the code and
allows LSMs to use more than just a single xattr to do their business.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: manual merge fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-03-20 21:54:33 -04:00
Ondrej Mosnacek
d0c9c153b4 kernfs: do not alloc iattrs in kernfs_xattr_get
This is a read-only operation, so we can simply return -ENODATA if
kn->iattr is NULL.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: minor merge fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-03-20 21:47:36 -04:00
Ondrej Mosnacek
0589521962 kernfs: clean up struct kernfs_iattrs
Right now, kernfs_iattrs embeds the whole struct iattr, even though it
doesn't really use half of its fields... This both leads to wasting
space and makes the code look awkward. Let's just list the few fields
we need directly in struct kernfs_iattrs.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: merged a number of chunks manually due to fuzz]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-03-20 21:42:14 -04:00
David Howells
cf3cba4a42 vfs: syscall: Add fspick() to select a superblock for reconfiguration
Provide an fspick() system call that can be used to pick an existing
mountpoint into an fs_context which can thereafter be used to reconfigure a
superblock (equivalent of the superblock side of -o remount).

This looks like:

	int fd = fspick(AT_FDCWD, "/mnt",
			FSPICK_CLOEXEC | FSPICK_NO_AUTOMOUNT);
	fsconfig(fd, FSCONFIG_SET_FLAG, "intr", NULL, 0);
	fsconfig(fd, FSCONFIG_SET_FLAG, "noac", NULL, 0);
	fsconfig(fd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);

At the point of fspick being called, the file descriptor referring to the
filesystem context is in exactly the same state as the one that was created
by fsopen() after fsmount() has been successfully called.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-api@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-20 18:49:06 -04:00