Commit Graph

48409 Commits

Author SHA1 Message Date
Theodore Ts'o
cd6bb35bf7 ext4: use more strict checks for inodes_per_block on mount
Centralize the checks for inodes_per_block and be more strict to make
sure the inodes_per_block_group can't end up being zero.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@vger.kernel.org
2016-11-18 13:28:30 -05:00
Theodore Ts'o
5aee0f8a3f ext4: fix in-superblock mount options processing
Fix a large number of problems with how we handle mount options in the
superblock.  For one, if the string in the superblock is long enough
that it is not null terminated, we could run off the end of the string
and try to interpret superblocks fields as characters.  It's unlikely
this will cause a security problem, but it could result in an invalid
parse.  Also, parse_options is destructive to the string, so in some
cases if there is a comma-separated string, it would be modified in
the superblock.  (Fortunately it only happens on file systems with a
1k block size.)

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2016-11-18 13:24:26 -05:00
Theodore Ts'o
9e47a4c9fc ext4: sanity check the block and cluster size at mount time
If the block size or cluster size is insane, reject the mount.  This
is important for security reasons (although we shouldn't be just
depending on this check).

Ref: http://www.securityfocus.com/archive/1/539661
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506
Reported-by: Borislav Petkov <bp@alien8.de>
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2016-11-18 13:00:24 -05:00
Alexey Dobriyan
c7d03a00b5 netns: make struct pernet_operations::id unsigned int
Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

	void f(long *p, int i)
	{
		g(p[i]);
	}

  roughly translates to

	movsx	rsi, esi
	mov	rdi, [rsi+...]
	call 	g

MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

	static inline void *net_generic(const struct net *net, int id)
	{
		...
		ptr = ng->ptr[id - 1];
		...
	}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]

However, overall balance is in negative direction:

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
	function                                     old     new   delta
	nfsd4_lock                                  3886    3959     +73
	tipc_link_build_proto_msg                   1096    1140     +44
	mac80211_hwsim_new_radio                    2776    2808     +32
	tipc_mon_rcv                                1032    1058     +26
	svcauth_gss_legacy_init                     1413    1429     +16
	tipc_bcbase_select_primary                   379     392     +13
	nfsd4_exchange_id                           1247    1260     +13
	nfsd4_setclientid_confirm                    782     793     +11
		...
	put_client_renew_locked                      494     480     -14
	ip_set_sockfn_get                            730     716     -14
	geneve_sock_add                              829     813     -16
	nfsd4_sequence_done                          721     703     -18
	nlmclnt_lookup_host                          708     686     -22
	nfsd4_lockt                                 1085    1063     -22
	nfs_get_client                              1077    1050     -27
	tcf_bpf_init                                1106    1076     -30
	nfsd4_encode_fattr                          5997    5930     -67
	Total: Before=154856051, After=154854321, chg -0.00%

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 10:59:15 -05:00
Linus Torvalds
bec1b089ab Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "A couple of regression fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix iov_iter_advance() for ITER_PIPE
  xattr: Fix setting security xattrs on sockfs
2016-11-17 13:49:30 -08:00
Linus Torvalds
d46bc34da9 Merge tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs fix from Mike Marshall:
 "orangefs: add .owner to debugfs file_operations

  Without ".owner = THIS_MODULE" it is possible to crash the kernel by
  unloading the Orangefs module while someone is reading debugfs files"

* tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  orangefs: add .owner to debugfs file_operations
2016-11-17 13:45:57 -08:00
Christoph Hellwig
542ff7bf18 block: new direct I/O implementation
Similar to the simple fast path, but we now need a dio structure to
track multiple-bio completions.  It's basically a cut-down version
of the new iomap-based direct I/O code for filesystems, but without
all the logic to call into the filesystem for extent lookup or
allocation, and without the complex I/O completion workqueue handler
for AIO - instead we just use the FUA bit on the bios to ensure
data is flushed to stable storage.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-17 13:35:11 -07:00
Jens Axboe
78250c02d9 block: make __blkdev_direct_IO_sync() support O_SYNC/DSYNC
Split the op setting code into a helper, use it in both places.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-17 13:35:05 -07:00
Jens Axboe
72ecad22d9 block: support a full bio worth of IO for simplified bdev direct-io
Just alloc the bio_vec array if we exceed the inline limit.

Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-17 13:35:02 -07:00
Christoph Hellwig
189ce2b9dc block: fast-path for small and simple direct I/O requests
This patch adds a small and simple fast patch for small direct I/O
requests on block devices that don't use AIO.  Between the neat
bio_iov_iter_get_pages helper that avoids allocating a page array
for get_user_pages and the on-stack bio and biovec this avoid memory
allocations and atomic operations entirely in the direct I/O code
(lower levels might still do memory allocations and will usually
have at least some atomic operations, though).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Tested-By: Stephen Bates <sbates@raithlin.com>
Reviewed-By: Stephen Bates <sbates@raithlin.com>
2016-11-17 13:34:45 -07:00
Seth Forshee
f97df70b1c xenfs: Use proc_create_mount_point() to create /proc/xen
Mounting proc in user namespace containers fails if the xenbus
filesystem is mounted on /proc/xen because this directory fails
the "permanently empty" test. proc_create_mount_point() exists
specifically to create such mountpoints in proc but is currently
proc-internal. Export this interface to modules, then use it in
xenbus when creating /proc/xen.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2016-11-17 13:52:18 +01:00
Andreas Gruenbacher
4a59015372 xattr: Fix setting security xattrs on sockfs
The IOP_XATTR flag is set on sockfs because sockfs supports getting the
"system.sockprotoname" xattr.  Since commit 6c6ef9f2, this flag is checked for
setxattr support as well.  This is wrong on sockfs because security xattr
support there is supposed to be provided by security_inode_setsecurity.  The
smack security module relies on socket labels (xattrs).

Fix this by adding a security xattr handler on sockfs that returns
-EAGAIN, and by checking for -EAGAIN in setxattr.

We cannot simply check for -EOPNOTSUPP in setxattr because there are
filesystems that neither have direct security xattr support nor support
via security_inode_setsecurity.  A more proper fix might be to move the
call to security_inode_setsecurity into sockfs, but it's not clear to me
if that is safe: we would end up calling security_inode_post_setxattr after
that as well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-11-17 00:00:23 -05:00
Linus Torvalds
984573abf8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
 "A regression fix and bug fix bound for stable"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix fuse_write_end() if zero bytes were copied
  fuse: fix root dentry initialization
2016-11-16 09:20:10 -08:00
Mike Marshall
19ff7fcc76 orangefs: add .owner to debugfs file_operations
Without ".owner = THIS_MODULE" it is possible to crash the kernel
by unloading the Orangefs module while someone is reading debugfs
files.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-11-16 11:52:19 -05:00
Nicolas Pitre
baa73d9e47 posix-timers: Make them configurable
Some embedded systems have no use for them.  This removes about
25KB from the kernel binary size when configured out.

Corresponding syscalls are routed to a stub logging the attempt to
use those syscalls which should be enough of a clue if they were
disabled without proper consideration. They are: timer_create,
timer_gettime: timer_getoverrun, timer_settime, timer_delete,
clock_adjtime, setitimer, getitimer, alarm.

The clock_settime, clock_gettime, clock_getres and clock_nanosleep
syscalls are replaced by simple wrappers compatible with CLOCK_REALTIME,
CLOCK_MONOTONIC and CLOCK_BOOTTIME only which should cover the vast
majority of use cases with very little code.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: linux-kbuild@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Michal Marek <mmarek@suse.com>
Cc: Edward Cree <ecree@solarflare.com>
Link: http://lkml.kernel.org/r/1478841010-28605-7-git-send-email-nicolas.pitre@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:26:35 +01:00
Kees Cook
fc46d4e453 ramoops: add pdata NULL check to ramoops_probe
This adds a check for a NULL platform data, which should only be possible
if a driver incorrectly sets up a probe request without also having defined
the platform_data structure. This is based on a patch from Geliang Tang.

Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:32 -08:00
Namhyung Kim
70ad35db33 pstore: Convert console write to use ->write_buf
Maybe I'm missing something, but I don't know why it needs to copy the
input buffer to psinfo->buf and then write.  Instead we can write the
input buffer directly.  The only implementation that supports console
message (i.e. ramoops) already does it for ftrace messages.

For the upcoming virtio backend driver, it needs to protect psinfo->buf
overwritten from console messages.  If it could use ->write_buf method
instead of ->write, the problem will be solved easily.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:32 -08:00
Namhyung Kim
e9e360b08a pstore: Protect unlink with read_mutex
When update_ms is set, pstore_get_records() will be called when there's
a new entry.  But unlink can be called at the same time and might
contend with the open-read-close loop.  Depending on the implementation
of platform driver, it may be safe or not.  But I think it'd be better
to protect those race in the first place.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:31 -08:00
Joel Fernandes
7a0032f504 pstore: Use global ftrace filters for function trace filtering
Currently, pstore doesn't have any filters setup for function tracing.
This has the associated overhead and may not be useful for users looking
for tracing specific set of functions.

ftrace's regular function trace filtering is done writing to
tracing/set_ftrace_filter however this is not available if not requested.
In order to be able to use this feature, the support to request global
filtering introduced earlier in the series should be requested before
registering the ftrace ops. Here we do the same.

Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:30 -08:00
Kees Cook
a5d23b956c pstore: Clarify context field przs as dprzs
Since "przs" (persistent ram zones) is a general name in the code now, so
rename the Oops-dump zones to dprzs from przs.

Based on a patch from Nobuhiro Iwamatsu.

Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:29 -08:00
Kees Cook
c443a5f3f1 pstore: improve error report for failed setup
When setting ramoops record sizes, sometimes it's not clear which
parameters contributed to the allocation failure. This adds a per-zone
name and expands the failure reports.

Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:28 -08:00
Joel Fernandes
2fbea82bbb pstore: Merge per-CPU ftrace records into one
Up until this patch, each of the per CPU ftrace buffers appear as a
separate ftrace-ramoops-N file. In this patch we merge all the zones into
one and populate a single ftrace-ramoops-0 file.

Signed-off-by: Joel Fernandes <joelaf@google.com>
[kees: clarified variables names, added -ENOMEM handling]
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:28 -08:00
Joel Fernandes
fbccdeb8d7 pstore: Add ftrace timestamp counter
In preparation for merging the per CPU buffers into one buffer when
we retrieve the pstore ftrace data, we store the timestamp as a
counter in the ftrace pstore record.  We store the CPU number as well
if !PSTORE_CPU_IN_IP, in this case we shift the counter and may lose
ordering there but we preserve the same record size. The timestamp counter
is also racy, and not doing any locking or synchronization here results
in the benefit of lower overhead. Since we don't care much here for exact
ordering of function traces across CPUs, we don't synchronize and may lose
some counter updates but I'm ok with that.

Using trace_clock() results in much lower performance so avoid using it
since we don't want accuracy in timestamp and need a rough ordering to
perform merge.

Signed-off-by: Joel Fernandes <joelaf@google.com>
[kees: updated commit message, added comments]
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:27 -08:00
Joel Fernandes
a1cf53ac6d ramoops: Split ftrace buffer space into per-CPU zones
If the RAMOOPS_FLAG_FTRACE_PER_CPU flag is passed to ramoops pdata, split
the ftrace space into multiple zones depending on the number of CPUs.

This speeds up the performance of function tracing by about 280% in my
tests as we avoid the locking. The trade off being lesser space available
per CPU. Let the ramoops user decide which option they want based on pdata
flag.

Signed-off-by: Joel Fernandes <joelaf@google.com>
[kees: added max_ftrace_cnt to track size, added DT logic and docs]
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:26 -08:00
Kees Cook
de83209249 pstore: Make ramoops_init_przs generic for other prz arrays
Currently ramoops_init_przs() is hard wired only for panic dump zone
array. In preparation for the ftrace zone array (one zone per-cpu) and pmsg
zone array, make the function more generic to be able to handle this case.

Heavily based on similar work from Joel Fernandes.

Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:25 -08:00
Joel Fernandes
663deb4788 pstore: Allow prz to control need for locking
In preparation of not locking at all for certain buffers depending on if
there's contention, make locking optional depending on the initialization
of the prz.

Signed-off-by: Joel Fernandes <joelaf@google.com>
[kees: moved locking flag into prz instead of via caller arguments]
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-11-15 16:34:25 -08:00
David S. Miller
bb598c1b8c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of bug fixes in 'net' overlapping other changes in
'net-next-.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-15 10:54:36 -05:00
Miklos Szeredi
59c3b76cc6 fuse: fix fuse_write_end() if zero bytes were copied
If pos is at the beginning of a page and copied is zero then page is not
zeroed but is marked uptodate.

Fix by skipping everything except unlock/put of page if zero bytes were
copied.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 6b12c1b37e ("fuse: Implement write_begin/write_end callbacks")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-11-15 12:34:21 +01:00
Eric Whitney
d5c8dab6a8 ext4: remove parameter from ext4_xattr_ibody_set()
The parameter "handle" isn't used.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-14 21:56:48 -05:00
Eric Whitney
88e0387769 ext4: allow inode expansion for nojournal file systems
Runs of xfstest ext4/022 on nojournal file systems result in failures
because the inodes of some of its test files do not expand as expected.
The cause is a conditional in ext4_mark_inode_dirty() that prevents inode
expansion unless the test file system has a journal.  Remove this
unnecessary restriction.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-14 21:48:35 -05:00
Deepa Dinamani
eeca7ea1ba ext4: use current_time() for inode timestamps
CURRENT_TIME_SEC and CURRENT_TIME are not y2038 safe.
current_time() will be transitioned to be y2038 safe
along with vfs.

current_time() returns timestamps according to the
granularities set in the super_block.
The granularity check in ext4_current_time() to call
current_time() or CURRENT_TIME_SEC is not required.
Use current_time() directly to obtain timestamps
unconditionally, and remove ext4_current_time().

Quota files are assumed to be on the same filesystem.
Hence, use current_time() for these files as well.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
2016-11-14 21:40:10 -05:00
Chandan Rajendra
30a9d7afe7 ext4: fix stack memory corruption with 64k block size
The number of 'counters' elements needed in 'struct sg' is
super_block->s_blocksize_bits + 2. Presently we have 16 'counters'
elements in the array. This is insufficient for block sizes >= 32k. In
such cases the memcpy operation performed in ext4_mb_seq_groups_show()
would cause stack memory corruption.

Fixes: c9de560ded
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
2016-11-14 21:26:26 -05:00
Chandan Rajendra
69e43e8cc9 ext4: fix mballoc breakage with 64k block size
'border' variable is set to a value of 2 times the block size of the
underlying filesystem. With 64k block size, the resulting value won't
fit into a 16-bit variable. Hence this commit changes the data type of
'border' to 'unsigned int'.

Fixes: c9de560ded
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@vger.kernel.org
2016-11-14 21:04:37 -05:00
Andreas Gruenbacher
db978da8fa proc: Pass file mode to proc_pid_make_inode
Pass the file mode of the proc inode to be created to
proc_pid_make_inode.  In proc_pid_make_inode, initialize inode->i_mode
before calling security_task_to_inode.  This allows selinux to set
isec->sclass right away without introducing "half-initialized" inode
security structs.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-11-14 15:39:48 -05:00
Julia Lawall
7ba630f54c nfsd: constify reply_cache_stats_operations structure
reply_cache_stats_operations, of type struct file_operations, is never
modified, so declare it as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-11-14 15:24:19 -05:00
J. Bruce Fields
8838203667 nfsd: update workqueue creation
No real change in functionality, but the old interface seems to be
deprecated.

We don't actually care about ordering necessarily, but we do depend on
running at most one work item at a time: nfsd4_process_cb_update()
assumes that no other thread is running it, and that no new callbacks
are starting while it's running.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-11-14 11:13:43 -05:00
Theodore Ts'o
1566a48aaa ext4: don't lock buffer in ext4_commit_super if holding spinlock
If there is an error reported in mballoc via ext4_grp_locked_error(),
the code is holding a spinlock, so ext4_commit_super() must not try to
lock the buffer head, or else it will trigger a BUG:

  BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:358
  in_atomic(): 1, irqs_disabled(): 0, pid: 993, name: mount
  CPU: 0 PID: 993 Comm: mount Not tainted 4.9.0-rc1-clouder1 #62
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
   ffff880006423548 ffffffff81318c89 ffffffff819ecdd0 0000000000000166
   ffff880006423558 ffffffff810810b0 ffff880006423580 ffffffff81081153
   ffff880006e5a1a0 ffff88000690e400 0000000000000000 ffff8800064235c0
  Call Trace:
    [<ffffffff81318c89>] dump_stack+0x67/0x9e
    [<ffffffff810810b0>] ___might_sleep+0xf0/0x140
    [<ffffffff81081153>] __might_sleep+0x53/0xb0
    [<ffffffff8126c1dc>] ext4_commit_super+0x19c/0x290
    [<ffffffff8126e61a>] __ext4_grp_locked_error+0x14a/0x230
    [<ffffffff81081153>] ? __might_sleep+0x53/0xb0
    [<ffffffff812822be>] ext4_mb_generate_buddy+0x1de/0x320

Since ext4_grp_locked_error() calls ext4_commit_super with sync == 0
(and it is the only caller which does so), avoid locking and unlocking
the buffer in this case.

This can result in races with ext4_commit_super() if there are other
problems (which is what commit 4743f83990 was trying to address),
but a Warning is better than BUG.

Fixes: 4743f83990
Cc: stable@vger.kernel.org # 4.9
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2016-11-13 22:02:29 -05:00
Theodore Ts'o
d0abb36db4 ext4: allow ext4_ext_truncate() to return an error
Return errors to the caller instead of declaring the file system
corrupted.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2016-11-13 22:02:28 -05:00
Theodore Ts'o
2c98eb5ea2 ext4: allow ext4_truncate() to return an error
This allows us to properly propagate errors back up to
ext4_truncate()'s callers.  This also means we no longer have to
silently ignore some errors (e.g., when trying to add the inode to the
orphan inode list).

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2016-11-13 22:02:26 -05:00
Theodore Ts'o
6da22013bb Merge branch 'fscrypt' into origin 2016-11-13 22:02:22 -05:00
Theodore Ts'o
a2f6d9c4c0 Merge branch 'dax-4.10-iomap-pmd' into origin 2016-11-13 22:02:15 -05:00
Eric Biggers
a6e0891286 fscrypto: don't use on-stack buffer for key derivation
With the new (in 4.9) option to use a virtually-mapped stack
(CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for
the scatterlist crypto API because they may not be directly mappable to
struct page.  get_crypt_info() was using a stack buffer to hold the
output from the encryption operation used to derive the per-file key.
Fix it by using a heap buffer.

This bug could most easily be observed in a CONFIG_DEBUG_SG kernel
because this allowed the BUG in sg_set_buf() to be triggered.

Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-13 21:56:25 -05:00
Eric Biggers
08ae877f4e fscrypto: don't use on-stack buffer for filename encryption
With the new (in 4.9) option to use a virtually-mapped stack
(CONFIG_VMAP_STACK), stack buffers cannot be used as input/output for
the scatterlist crypto API because they may not be directly mappable to
struct page.  For short filenames, fname_encrypt() was encrypting a
stack buffer holding the padded filename.  Fix it by encrypting the
filename in-place in the output buffer, thereby making the temporary
buffer unnecessary.

This bug could most easily be observed in a CONFIG_DEBUG_SG kernel
because this allowed the BUG in sg_set_buf() to be triggered.

Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-13 21:56:19 -05:00
David Gstir
9c4bb8a3a9 fscrypt: Let fs select encryption index/tweak
Avoid re-use of page index as tweak for AES-XTS when multiple parts of
same page are encrypted. This will happen on multiple (partial) calls of
fscrypt_encrypt_page on same page.
page->index is only valid for writeback pages.

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-13 20:18:16 -05:00
David Gstir
0b93e1b94b fscrypt: Constify struct inode pointer
Some filesystems, such as UBIFS, maintain a const pointer for struct
inode.

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-13 20:18:01 -05:00
David Gstir
7821d4dd45 fscrypt: Enable partial page encryption
Not all filesystems work on full pages, thus we should allow them to
hand partial pages to fscrypt for en/decryption.

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-13 18:55:21 -05:00
David Gstir
b50f7b268b fscrypt: Allow fscrypt_decrypt_page() to function with non-writeback pages
Some filesystem might pass pages which do not have page->mapping->host
set to the encrypted inode. We want the caller to explicitly pass the
corresponding inode.

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-13 18:53:10 -05:00
David Gstir
1c7dcf69ee fscrypt: Add in-place encryption mode
ext4 and f2fs require a bounce page when encrypting pages. However, not
all filesystems will need that (eg. UBIFS). This is handled via a
flag on fscrypt_operations where a fs implementation can select in-place
encryption over using a bounce page (which is the default).

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-11-13 18:47:04 -05:00
Deepa Dinamani
362ad5d58e fs: jfs: Replace CURRENT_TIME_SEC by current_time()
jfs uses nanosecond granularity for filesystem timestamps.
Only this assignment is not using nanosecond granularity.
Use current_time() to get the right granularity.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2016-11-11 15:51:39 -06:00
Jens Axboe
bbd7bb7017 block: move poll code to blk-mq
The poll code is blk-mq specific, let's move it to blk-mq.c. This
is a prep patch for improving the polling code.

Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2016-11-11 13:40:25 -07:00