Commit Graph

56429 Commits

Author SHA1 Message Date
Abhi Das
b066a4eebd gfs2: forcibly flush ail to relieve memory pressure
On systems with low memory, it is possible for gfs2 to infinitely
loop in balance_dirty_pages() under heavy IO (creating sparse files).

balance_dirty_pages() attempts to write out the dirty pages via
gfs2_writepages() but none are found because these dirty pages are
being used by the journaling code in the ail. Normally, the journal
has an upper threshold which when hit triggers an automatic flush
of the ail. But this threshold can be higher than the number of
allowable dirty pages and result in the ail never being flushed.

This patch forces an ail flush when gfs2_writepages() fails to write
anything. This is a good indication that the ail might be holding
some dirty pages.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-10 10:51:03 -05:00
Andreas Gruenbacher
a91323e255 gfs2: Clean up waiting on glocks
The prepare_to_wait_on_glock and finish_wait_on_glock functions introduced in
commit 56a365be "gfs2: gfs2_glock_get: Wait on freeing glocks" are
better removed, resulting in cleaner code.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-10 10:51:02 -05:00
Andreas Gruenbacher
6a1c8f6dcf gfs2: Defer deleting inodes under memory pressure
When under memory pressure and an inode's link count has dropped to
zero, defer deleting the inode to the delete workqueue.  This avoids
calling into DLM under memory pressure, which can deadlock.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-10 10:49:13 -05:00
Andreas Gruenbacher
71c1b21368 gfs2: gfs2_evict_inode: Put glocks asynchronously
gfs2_evict_inode is called to free inodes under memory pressure.  The
function calls into DLM when an inode's last cluster-wide reference goes
away (remote unlink) and to release the glock and associated DLM lock
before finally destroying the inode.  However, if DLM is blocked on
memory to become available, calling into DLM again will deadlock.

Avoid that by decoupling releasing glocks from destroying inodes in that
case: with gfs2_glock_queue_put, glocks will be dequeued asynchronously
in work queue context, when the associated inodes have likely already
been destroyed.

With this change, inodes can end up being unlinked, remote-unlink can be
triggered, and then the inode can be reallocated before all
remote-unlink callbacks are processed.  To detect that, revalidate the
link count in gfs2_evict_inode to make sure we're not deleting an
allocated, referenced inode.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-10 10:45:21 -05:00
Andreas Gruenbacher
eebd2e813f gfs2: Get rid of gfs2_set_nlink
Remove gfs2_set_nlink which prevents the link count of an inode from
becoming non-zero once it has reached zero.  The next commit reduces the
amount of waiting on glocks when an inode is evicted from memory.  With
that, an inode can become reallocated before all the remote-unlink
callbacks from a previous delete are processed, which causes the link
count to change from zero to non-zero.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-10 10:42:11 -05:00
Andreas Gruenbacher
0515480ad4 gfs2: gfs2_glock_get: Wait on freeing glocks
Keep glocks in their hash table until they are freed instead of removing
them when their last reference is dropped.  This allows to wait for any
previous instances of a glock to go away in gfs2_glock_get before
creating a new glocks.

Special thanks to Andy Price for finding and fixing a problem which also
required us to delete the rcu_read_unlock from the error case in function
gfs2_glock_get.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-10 10:39:31 -05:00
Peter Zijlstra
a9668cd6ee locking: Remove smp_mb__before_spinlock()
Now that there are no users of smp_mb__before_spinlock() left, remove
it entirely.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:29:03 +02:00
Peter Zijlstra
ff7a5fb0f1 overlayfs, locking: Remove smp_mb__before_spinlock() usage
While we could replace the smp_mb__before_spinlock() with the new
smp_mb__after_spinlock(), the normal pattern is to use
smp_store_release() to publish an object that is used for
lockless_dereference() -- and mirrors the regular rcu_assign_pointer()
/ rcu_dereference() patterns.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:29:02 +02:00
Aleksa Sarai
74dc3384fc sched/debug: Use task_pid_nr_ns in /proc/$pid/sched
It appears as though the addition of the PID namespace did not update
the output code for /proc/*/sched, which resulted in it providing PIDs
that were not self-consistent with the /proc mount. This additionally
made it trivial to detect whether a process was inside &init_pid_ns from
userspace, making container detection trivial:

   https://github.com/jessfraz/amicontained

This leads to situations such as:

  % unshare -pmf
  % mount -t proc proc /proc
  % head -n1 /proc/1/sched
  head (10047, #threads: 1)

Fix this by just using task_pid_nr_ns for the output of /proc/*/sched.
All of the other uses of task_pid_nr in kernel/sched/debug.c are from a
sysctl context and thus don't need to be namespaced.

Signed-off-by: Aleksa Sarai <asarai@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jess Frazelle <acidburn@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: cyphar@cyphar.com
Link: http://lkml.kernel.org/r/20170806044141.5093-1-asarai@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:18:19 +02:00
Chao Yu
b0af6d491a f2fs: add app/fs io stat
This patch enables inner app/fs io stats and introduces below virtual fs
nodes for exposing stats info:
/sys/fs/f2fs/<dev>/iostat_enable
/proc/fs/f2fs/<dev>/iostat_info

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix wrong stat assignment]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-09 21:43:58 -07:00
Yunlong Song
35ee82ca13 f2fs: do not change the valid_block value if cur_valid_map was wrongly set or cleared
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-09 17:45:23 -07:00
Yunlong Song
6415fedc57 f2fs: update cur_valid_map_mir together with cur_valid_map
When cur_valid_map passes the f2fs_test_and_set(,clear)_bit test,
cur_valid_map_mir update is skipped unlikely, so fix it. The fix
now changes the mirror check together with cur_valid_map all the
time.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: Fix unused variable and add unlikely for corner condition.]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-09 17:45:21 -07:00
David S. Miller
3118e6e19d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The UDP offload conflict is dealt with by simply taking what is
in net-next where we have removed all of the UFO handling code
entirely.

The TCP conflict was a case of local variables in a function
being removed from both net and net-next.

In netvsc we had an assignment right next to where a missing
set of u64 stats sync object inits were added.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-09 16:28:45 -07:00
Trond Myklebust
c0ca0e5934 NFSv4: Ignore NFS4ERR_OLD_STATEID in nfs41_check_open_stateid()
If the call to TEST_STATEID returns NFS4ERR_OLD_STATEID, then it just
means we raced with other calls to OPEN.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-08-09 13:36:56 -04:00
Andreas Gruenbacher
61b91cfdc6 gfs2: Fix trivial typos
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-09 09:36:39 -05:00
Bob Peterson
b2fb7dab7f GFS2: Delete debugfs files only after we evict the glocks
This patch moves the call to gfs2_delete_debugfs_file so that it
comes after the glock hash table has been cleared. This way we
can query the debugfs files if umount hangs.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-09 09:36:39 -05:00
Bob Peterson
645ebd49f0 GFS2: Don't waste time locking lru_lock for non-lru glocks
Before this patch, glock_dq would call gfs2_glock_remove_from_lru.
For glocks that are never put on the LRU, such as the transaction
glock, this just takes the spin_lock, determines there's nothing to
be done because the list is empty, then unlocks again. This was
causing unnecessary lock contention on the lru_lock spin_lock.
This patch adds a check for GLOF_LRU in the glops before taking
the spin_lock.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-09 09:36:39 -05:00
Bob Peterson
2d821a8b71 GFS2: Don't bother trying to add rgrps to the lru list
This patch removes a call to gfs2_glock_add_to_lru from function
gfs2_clear_rgrpd. The call is just a waste of time because as soon
as it adds it to the lru_list, the call to gfs2_glock_put takes it
back off again.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2017-08-09 09:36:38 -05:00
Bob Peterson
240c6235df GFS2: Clear gl_object when deleting an inode in gfs2_delete_inode
This patch adds some calls to clear gl_object in function
gfs2_delete_inode. Since we are deleting the inode, and the glock
typically outlives the inode in core, we must clear gl_object
so subsequent use of the glock (e.g. for a new inode in its place)
will not have the old pointer sitting there. In error cases we
need to tidy up after ourselves. In non-error cases, we need to
clear gl_object before we set the block free in the bitmap so
residules aren't left for potential inode creators.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2017-08-09 09:36:38 -05:00
Bob Peterson
9c1b28081f GFS2: Clear gl_object if gfs2_create_inode fails
If function gfs2_create_inode fails after the inode has been
created (for example, if the inode_refresh fails for some reason)
the function was setting gl_object but never clearing it again.
The glocks are left pointing to a freed inode. This patch adds
the calls to clear gl_object in the appropriate error paths.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
2017-08-09 09:36:26 -05:00
Weston Andros Adamson
1feb26162b nfs/flexfiles: fix leak of nfs4_ff_ds_version arrays
The client was freeing the nfs4_ff_layout_ds, but not the contained
nfs4_ff_ds_version array.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-08-08 17:18:10 -04:00
Linus Torvalds
1742c0f055 Merge tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
 "I have a couple more bug fixes for you today:

   - fix memory leak when issuing discard

   - fix propagation of the dax inode flag"

* tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Fix per-inode DAX flag inheritance
  xfs: Fix leak of discard bio
2017-08-07 18:16:22 -07:00
David S. Miller
fde6af4729 Merge tag 'mlx5-shared-2017-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:

====================
mlx5-shared-2017-08-07

This series includes some mlx5 updates for both net-next and rdma trees.

From Saeed,
Core driver updates to allow selectively building the driver with
or without some large driver components, such as
	- E-Switch (Ethernet SRIOV support).
	- Multi-Physical Function Switch (MPFs) support.
For that we split E-Switch and MPFs functionalities into separate files.

From Erez,
Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration
is taking place and until it completes.

From Rabie,
Increase the maximum supported flow counters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 10:42:09 -07:00
Guoqing Jiang
1c24285372 dlm: use sock_create_lite inside tcp_accept_from_sock
With commit 0ffdaf5b41 ("net/sock: add WARN_ON(parent->sk)
in sock_graft()"), a calltrace happened as follows:

[  457.018340] WARNING: CPU: 0 PID: 15623 at ./include/net/sock.h:1703 inet_accept+0x135/0x140
...
[  457.018381] RIP: 0010:inet_accept+0x135/0x140
[  457.018381] RSP: 0018:ffffc90001727d18 EFLAGS: 00010286
[  457.018383] RAX: 0000000000000001 RBX: ffff880012413000 RCX: 0000000000000001
[  457.018384] RDX: 000000000000018a RSI: 00000000fffffe01 RDI: ffffffff8156fae8
[  457.018384] RBP: ffffc90001727d38 R08: 0000000000000000 R09: 0000000000004305
[  457.018385] R10: 0000000000000001 R11: 0000000000004304 R12: ffff880035ae7a00
[  457.018386] R13: ffff88001282af10 R14: ffff880034e4e200 R15: 0000000000000000
[  457.018387] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[  457.018388] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  457.018389] CR2: 00007fdec22f9000 CR3: 0000000002b5a000 CR4: 00000000000006f0
[  457.018395] Call Trace:
[  457.018402]  tcp_accept_from_sock.part.8+0x12d/0x449 [dlm]
[  457.018405]  ? vprintk_emit+0x248/0x2d0
[  457.018409]  tcp_accept_from_sock+0x3f/0x50 [dlm]
[  457.018413]  process_recv_sockets+0x3b/0x50 [dlm]
[  457.018415]  process_one_work+0x138/0x370
[  457.018417]  worker_thread+0x4d/0x3b0
[  457.018419]  kthread+0x109/0x140
[  457.018421]  ? rescuer_thread+0x320/0x320
[  457.018422]  ? kthread_park+0x60/0x60
[  457.018424]  ret_from_fork+0x25/0x30

Since newsocket created by sock_create_kern sets it's
sock by the path:

	sock_create_kern -> __sock_creat
			 ->pf->create => inet_create
			 -> sock_init_data

Then WARN_ON is triggered by "con->sock->ops->accept =>
inet_accept -> sock_graft", it also means newsock->sk
is leaked since sock_graft will replace it with a new
sk.

To resolve the issue, we need to use sock_create_lite
instead of sock_create_kern, like commit 0933a578cd
("rds: tcp: use sock_create_lite() to create the accept
socket") did.

Reported-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Edwin Török
55acdd926f dlm: avoid double-free on error path in dlm_device_{register,unregister}
Can be reproduced when running dlm_controld (tested on 4.4.x, 4.12.4):
 # seq 1 100 | xargs -P0 -n1 dlm_tool join
 # seq 1 100 | xargs -P0 -n1 dlm_tool leave

misc_register fails due to duplicate sysfs entry, which causes
dlm_device_register to free ls->ls_device.name.
In dlm_device_deregister the name was freed again, causing memory
corruption.

According to the comment in dlm_device_deregister the name should've been
set to NULL when registration fails,
so this patch does that.

sysfs: cannot create duplicate filename '/dev/char/10:1'
------------[ cut here ]------------
warning: cpu: 1 pid: 4450 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x56/0x70
modules linked in: msr rfcomm dlm ccm bnep dm_crypt uvcvideo
videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev
btusb media btrtl btbcm btintel bluetooth ecdh_generic intel_rapl
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
snd_hda_codec_hdmi irqbypass crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel thinkpad_acpi pcbc nvram snd_seq_midi
snd_seq_midi_event aesni_intel snd_hda_codec_realtek snd_hda_codec_generic
snd_rawmidi aes_x86_64 crypto_simd glue_helper snd_hda_intel snd_hda_codec
cryptd intel_cstate arc4 snd_hda_core snd_seq snd_seq_device snd_hwdep
iwldvm intel_rapl_perf mac80211 joydev input_leds iwlwifi serio_raw
cfg80211 snd_pcm shpchp snd_timer snd mac_hid mei_me lpc_ich mei soundcore
sunrpc parport_pc ppdev lp parport autofs4 i915 psmouse
 e1000e ahci libahci i2c_algo_bit sdhci_pci ptp drm_kms_helper sdhci
pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops drm wmi video
cpu: 1 pid: 4450 comm: dlm_test.exe not tainted 4.12.4-041204-generic
hardware name: lenovo 232425u/232425u, bios g2et82ww (2.02 ) 09/11/2012
task: ffff96b0cbabe140 task.stack: ffffb199027d0000
rip: 0010:sysfs_warn_dup+0x56/0x70
rsp: 0018:ffffb199027d3c58 eflags: 00010282
rax: 0000000000000038 rbx: ffff96b0e2c49158 rcx: 0000000000000006
rdx: 0000000000000000 rsi: 0000000000000086 rdi: ffff96b15e24dcc0
rbp: ffffb199027d3c70 r08: 0000000000000001 r09: 0000000000000721
r10: ffffb199027d3c00 r11: 0000000000000721 r12: ffffb199027d3cd1
r13: ffff96b1592088f0 r14: 0000000000000001 r15: ffffffffffffffef
fs:  00007f78069c0700(0000) gs:ffff96b15e240000(0000)
knlgs:0000000000000000
cs:  0010 ds: 0000 es: 0000 cr0: 0000000080050033
cr2: 000000178625ed28 cr3: 0000000091d3e000 cr4: 00000000001406e0
call trace:
 sysfs_do_create_link_sd.isra.2+0x9e/0xb0
 sysfs_create_link+0x25/0x40
 device_add+0x5a9/0x640
 device_create_groups_vargs+0xe0/0xf0
 device_create_with_groups+0x3f/0x60
 ? snprintf+0x45/0x70
 misc_register+0x140/0x180
 device_write+0x6a8/0x790 [dlm]
 __vfs_write+0x37/0x160
 ? apparmor_file_permission+0x1a/0x20
 ? security_file_permission+0x3b/0xc0
 vfs_write+0xb5/0x1a0
 sys_write+0x55/0xc0
 ? sys_fcntl+0x5d/0xb0
 entry_syscall_64_fastpath+0x1e/0xa9
rip: 0033:0x7f78083454bd
rsp: 002b:00007f78069bbd30 eflags: 00000293 orig_rax: 0000000000000001
rax: ffffffffffffffda rbx: 0000000000000006 rcx: 00007f78083454bd
rdx: 000000000000009c rsi: 00007f78069bee00 rdi: 0000000000000005
rbp: 00007f77f8000a20 r08: 000000000000fcf0 r09: 0000000000000032
r10: 0000000000000024 r11: 0000000000000293 r12: 00007f78069bde00
r13: 00007f78069bee00 r14: 000000000000000a r15: 00007f78069bbd70
code: 85 c0 48 89 c3 74 12 b9 00 10 00 00 48 89 c2 31 f6 4c 89 ef e8 2c c8
ff ff 4c 89 e2 48 89 de 48 c7 c7 b0 8e 0c a8 e8 41 e8 ed ff <0f> ff 48 89
df e8 00 d5 f4 ff 5b 41 5c 41 5d 5d c3 66 0f 1f 84
---[ end trace 40412246357cc9e0 ]---

dlm: 59f24629-ae39-44e2-9030-397ebc2eda26: leaving the lockspace group...
bug: unable to handle kernel null pointer dereference at 0000000000000001
ip: [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140
pgd 0
oops: 0000 [#1] smp
modules linked in: dlm 8021q garp mrp stp llc openvswitch nf_defrag_ipv6
nf_conntrack libcrc32c iptable_filter dm_multipath crc32_pclmul dm_mod
aesni_intel psmouse aes_x86_64 sg ablk_helper cryptd lrw gf128mul
glue_helper i2c_piix4 nls_utf8 tpm_tis tpm isofs nfsd auth_rpcgss
oid_registry nfs_acl lockd grace sunrpc xen_wdt ip_tables x_tables autofs4
hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi 8139too
serio_raw ata_piix 8139cp mii uhci_hcd ehci_pci ehci_hcd libata
scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod ipv6
cpu: 0 pid: 394 comm: systemd-udevd tainted: g w 4.4.0+0 #1
hardware name: xen hvm domu, bios 4.7.2-2.2 05/11/2017
task: ffff880002410000 ti: ffff88000243c000 task.ti: ffff88000243c000
rip: e030:[<ffffffff811a3b4a>] [<ffffffff811a3b4a>]
kmem_cache_alloc+0x7a/0x140
rsp: e02b:ffff88000243fd90 eflags: 00010202
rax: 0000000000000000 rbx: ffff8800029864d0 rcx: 000000000007b36c
rdx: 000000000007b36b rsi: 00000000024000c0 rdi: ffff880036801c00
rbp: ffff88000243fdc0 r08: 0000000000018880 r09: 0000000000000054
r10: 000000000000004a r11: ffff880034ace6c0 r12: 00000000024000c0
r13: ffff880036801c00 r14: 0000000000000001 r15: ffffffff8118dcc2
fs: 00007f0ab77548c0(0000) gs:ffff880036e00000(0000) knlgs:0000000000000000
cs: e033 ds: 0000 es: 0000 cr0: 0000000080050033
cr2: 0000000000000001 cr3: 000000000332d000 cr4: 0000000000040660
stack:
ffffffff8118dc90 ffff8800029864d0 0000000000000000 ffff88003430b0b0
ffff880034b78320 ffff88003430b0b0 ffff88000243fdf8 ffffffff8118dcc2
ffff8800349c6700 ffff8800029864d0 000000000000000b 00007f0ab7754b90
call trace:
[<ffffffff8118dc90>] ? anon_vma_fork+0x60/0x140
[<ffffffff8118dcc2>] anon_vma_fork+0x92/0x140
[<ffffffff8107033e>] copy_process+0xcae/0x1a80
[<ffffffff8107128b>] _do_fork+0x8b/0x2d0
[<ffffffff81071579>] sys_clone+0x19/0x20
[<ffffffff815a30ae>] entry_syscall_64_fastpath+0x12/0x71
] code: f6 75 1c 4c 89 fa 44 89 e6 4c 89 ef e8 a7 e4 00 00 41 f7 c4 00 80
00 00 49 89 c6 74 47 eb 32 49 63 45 20 48 8d 4a 01 4d 8b 45 00 <49> 8b 1c
06 4c 89 f0 65 49 0f c7 08 0f 94 c0 84 c0 74 ac 49 63
rip [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140
rsp <ffff88000243fd90>
cr2: 0000000000000001
--[ end trace 70cb9fd1b164a0e8 ]--

CC: stable@vger.kernel.org
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Bhumika Goyal
417f7c59ed dlm: constify kset_uevent_ops structure
Declare kset_uevent_ops structure as const as it is only passed as an
argument to the function kset_create_and_add. This argument is of type
const, so declare the structure as const.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Zhu Lingshan
3b0e761ba8 dlm: print log message when cluster name is not set
Print a message when a cluster name is not specified by
the caller.  In this case the cluster name configured
for the dlm is used without any validation that it is
the cluster expected by the application.

Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
2ab93ae138 dlm: Delete an unnecessary variable initialisation in dlm_ls_start()
The local variable "rv" is reassigned by a statement at the beginning.
Thus omit the explicit initialisation.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
d12ad1a964 dlm: Improve a size determination in two functions
Replace the specification of two data structures by pointer dereferences
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
2f48e06102 dlm: Use kcalloc() in two functions
* Multiplications for the size determination of memory allocations
  indicated that array data structures should be processed.
  Thus reuse the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of data structures by pointer dereferences
  to make the corresponding size determinations a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
790854becc dlm: Use kmalloc_array() in make_member_array()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kmalloc_array".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data type by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
0d37eca752 dlm: Delete an error message for a failed memory allocation in dlm_recover_waiters_pre()
Omit an extra message for a memory allocation failure in this function.

Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
102e67d4e3 dlm: Improve a size determination in dlm_recover_waiters_pre()
Replace the specification of a data structure by a pointer dereference
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
fbb1008151 dlm: Use kcalloc() in dlm_scan_waiters()
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "kcalloc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
2c257e96df dlm: Improve a size determination in table_seq_start()
Replace the specification of a data structure by a pointer dereference
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
41922ce831 dlm: Add spaces for better code readability
The script "checkpatch.pl" pointed information out like the following.

CHECK: spaces preferred around that '+' (ctx:VxV)

Thus fix the affected source code places.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Markus Elfring
653996ca8d dlm: Replace six seq_puts() calls by seq_putc()
Six single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Gang He
8e1743748b dlm: Make dismatch error message more clear
This change will try to make this error message more clear,
since the upper applications (e.g. ocfs2) invoke dlm_new_lockspace
to create a new lockspace with passing a cluster name. Sometimes,
dlm_new_lockspace return failure while two cluster names dismatch,
the user is a little confused since this line error message is not
enough obvious.

Signed-off-by: Gang He <ghe@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
Vlad Tsyrklevich
8286d6b14c dlm: Fix kernel memory disclosure
Clear the 'unused' field and the uninitialized padding in 'lksb' to
avoid leaking memory to userland in copy_result_to_user().

Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 11:23:09 -05:00
zhangyi (F)
41e327b586 quota: correct space limit check
Currently we compare total space (curspace + rsvspace)
with space limit in quota-tools when setting grace time
and also in check_bdq(), but we missing rsvspace in
somewhere else, correct them. This patch also fix incorrect
zero dqb_btime and grace time updating failure when we use
rsvspace(e.g. ext4 dalloc feature).

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-08-07 16:51:28 +02:00
Trond Myklebust
bfab281721 NFSv4: Cleanup setting of the migration flags.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-07 09:32:55 -04:00
Trond Myklebust
937e3133cd NFSv4.1: Ensure we clear the SP4_MACH_CRED flags in nfs4_sp4_select_mode()
If the server changes, so that it no longer supports SP4_MACH_CRED, or
that it doesn't support the same set of SP4_MACH_CRED functionality,
then we want to ensure that we clear the unsupported flags.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-07 09:32:55 -04:00
Trond Myklebust
9c760d1fd5 NFSv4: Refactor _nfs4_proc_exchange_id()
Tease apart the functionality in nfs4_exchange_id_done() so that
it is easier to debug exchange id vs trunking issues by moving
all the processing out of nfs4_exchange_id_done() and into the
callers.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-07 09:32:55 -04:00
Linus Torvalds
ed66da1104 Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
 "A large number of ext4 bug fixes and cleanups for v4.13"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix copy paste error in ext4_swap_extents()
  ext4: fix overflow caused by missing cast in ext4_resize_fs()
  ext4, project: expand inode extra size if possible
  ext4: cleanup ext4_expand_extra_isize_ea()
  ext4: restructure ext4_expand_extra_isize
  ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize
  ext4: make xattr inode reads faster
  ext4: inplace xattr block update fails to deduplicate blocks
  ext4: remove unused mode parameter
  ext4: fix warning about stack corruption
  ext4: fix dir_nlink behaviour
  ext4: silence array overflow warning
  ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
  ext4: release discard bio after sending discard commands
  ext4: convert swap_inode_data() over to use swap() on most of the fields
  ext4: error should be cleared if ea_inode isn't added to the cache
  ext4: Don't clear SGID when inheriting ACLs
  ext4: preserve i_mode if __ext4_set_acl() fails
  ext4: remove unused metadata accounting variables
  ext4: correct comment references to ext4_ext_direct_IO()
2017-08-06 12:31:17 -07:00
Maninder Singh
4e56201321 ext4: fix copy paste error in ext4_swap_extents()
This bug was found by a static code checker tool for copy paste
problems.

Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-06 01:33:07 -04:00
Jerry Lee
aec51758ce ext4: fix overflow caused by missing cast in ext4_resize_fs()
On a 32-bit platform, the value of n_blcoks_count may be wrong during
the file system is resized to size larger than 2^32 blocks.  This may
caused the superblock being corrupted with zero blocks count.

Fixes: 1c6bd7173d
Signed-off-by: Jerry Lee <jerrylee@qnap.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org # 3.7+
2017-08-06 01:18:31 -04:00
Miao Xie
c03b45b853 ext4, project: expand inode extra size if possible
When upgrading from old format, try to set project id
to old file first time, it will return EOVERFLOW, but if
that file is dirtied(touch etc), changing project id will
be allowed, this might be confusing for users, we could
try to expand @i_extra_isize here too.

Reported-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-06 01:00:49 -04:00
Miao Xie
b640b2c51b ext4: cleanup ext4_expand_extra_isize_ea()
Clean up some goto statement, make ext4_expand_extra_isize_ea() clearer.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06 00:55:48 -04:00
Miao Xie
cf0a5e818f ext4: restructure ext4_expand_extra_isize
Current ext4_expand_extra_isize just tries to expand extra isize, if
someone is holding xattr lock or some check fails, it will give up.
So rename its name to ext4_try_to_expand_extra_isize.

Besides that, we clean up unnecessary check and move some relative checks
into it.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06 00:40:01 -04:00
Miao Xie
3b10fdc6d8 ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize
We should avoid the contention between the i_extra_isize update and
the inline data insertion, so move the xattr trylock in front of
i_extra_isize update.

Signed-off-by: Miao Xie <miaoxie@huawei.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06 00:27:38 -04:00