Commit Graph

41956 Commits

Author SHA1 Message Date
Miklos Szeredi
a78d9f0d5d ovl: support multiple lower layers
Allow "lowerdir=" option to contain multiple lower directories separated by
a colon (e.g. "lowerdir=/bin:/usr/bin").  Colon characters in filenames can
be escaped with a backslash.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:52 +01:00
Miklos Szeredi
53a08cb9b8 ovl: make upperdir optional
Make "upperdir=" mount option optional.  If "upperdir=" is not given, then
the "workdir=" option is also optional (and ignored if given).

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:51 +01:00
Miklos Szeredi
ab508822ca ovl: improve mount helpers
Move common checks into ovl_mount_dir() helper.

Create helper for looking up lower directories.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:49 +01:00
Miklos Szeredi
3b7a9a249a ovl: mount: change order of initialization
Move allocation of root entry above to where it's needed.

Move initializations related to upperdir and workdir near each other.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:48 +01:00
Miklos Szeredi
4ebc581828 ovl: allow statfs if no upper layer
Handle "no upper layer" case in statfs.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:46 +01:00
Miklos Szeredi
09e10322b7 ovl: lookup ENAMETOOLONG on lower means ENOENT
"Suppose you have in one of the lower layers a filesystem with
->lookup()-enforced upper limit on name length.  Pretty much every local fs
has one, but... they are not all equal.  255 characters is the common upper
limit, but e.g. jffs2 stops at 254, minixfs upper limit is somewhere from
14 to 60, depending upon version, etc.  You are doing a lookup for
something that is present in upper layer, but happens to be too long for
one of the lower layers.  Too bad - ENAMETOOLONG for you..."

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:45 +01:00
Miklos Szeredi
3e01cee3b9 ovl: check whiteout on lowest layer as well
Not checking whiteouts on lowest layer was an optimization (there's nothing
to white out there), but it could result in inconsitent behavior when a
layer previously used as upper/middle is later used as lowest. 

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:45 +01:00
Miklos Szeredi
3d3c6b8939 ovl: multi-layer lookup
Look up dentry in all relevant layers.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:44 +01:00
Miklos Szeredi
9d7459d834 ovl: multi-layer readdir
If multiple lower layers exist, merge them as well in readdir according to
the same rules as merging upper with lower.  I.e. take whiteouts and opaque
directories into account on all but the lowers layer.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:44 +01:00
Miklos Szeredi
5ef88da56a ovl: helper to iterate layers
Add helper to iterate through all the layers, starting from the upper layer
(if exists) and continuing down through the lower layers.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:43 +01:00
Miklos Szeredi
dd662667e6 ovl: add mutli-layer infrastructure
Add multiple lower layers to 'struct ovl_fs' and 'struct ovl_entry'.

ovl_entry will have an array of paths, instead of just the dentry.  This
allows a compact array containing just the layers which exist at current
point in the tree (which is expected to be a small number for the majority
of dentries).

The number of layers is not limited by this infrastructure.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:43 +01:00
Miklos Szeredi
263b4a0fee ovl: dont replace opaque dir
When removing an empty opaque directory, then it makes no sense to replace
it with an exact replica of itself before removal.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:43 +01:00
Miklos Szeredi
1afaba1ecb ovl: make path-type a bitmap
OVL_PATH_PURE_UPPER -> __OVL_PATH_UPPER | __OVL_PATH_PURE
OVL_PATH_UPPER      -> __OVL_PATH_UPPER
OVL_PATH_MERGE      -> __OVL_PATH_UPPER | __OVL_PATH_MERGE
OVL_PATH_LOWER      -> 0

Multiple R/O layers will allow __OVL_PATH_MERGE without __OVL_PATH_UPPER.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:42 +01:00
Miklos Szeredi
49c21e1cac ovl: check whiteout while reading directory
Don't make a separate pass for checking whiteouts, since we can do it while
reading the upper directory.

This will make it easier to handle multiple layers.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-13 00:59:42 +01:00
Jiri Slaby
fa0c554073 reiserfs: destroy allocated commit workqueue
When resirefs is trying to mount a partition, it creates a commit
workqueue (sbi->commit_wq). But when mount fails later, the workqueue
is not freed.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: auxsvr@gmail.com
Reported-by: Benoît Monin <benoit.monin@gmx.fr>
Cc: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org # >= 3.16
Cc: reiserfs-devel@vger.kernel.org
Fixes: 797d9016ce
Signed-off-by: Jan Kara <jack@suse.cz>
2014-12-12 22:18:07 +01:00
Linus Torvalds
6ce4436c9c Merge tag 'please-pull-morepstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull pstore update #2 from Tony Luck:
 "Couple of pstore-ram enhancements to allow use of different memory
  attributes"

* tag 'please-pull-morepstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  pstore-ram: Allow optional mapping with pgprot_noncached
  pstore-ram: Fix hangs by using write-combine mappings
2014-12-12 11:34:13 -08:00
Linus Torvalds
bdeb03cada Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs update from Chris Mason:
 "From a feature point of view, most of the code here comes from Miao
  Xie and others at Fujitsu to implement scrubbing and replacing devices
  on raid56.  This has been in development for a while, and it's a big
  improvement.

  Filipe and Josef have a great assortment of fixes, many of which solve
  problems corruptions either after a crash or in error conditions.  I
  still have a round two from Filipe for next week that solves
  corruptions with discard and block group removal"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (62 commits)
  Btrfs: make get_caching_control unconditionally return the ctl
  Btrfs: fix unprotected deletion from pending_chunks list
  Btrfs: fix fs mapping extent map leak
  Btrfs: fix memory leak after block remove + trimming
  Btrfs: make btrfs_abort_transaction consider existence of new block groups
  Btrfs: fix race between writing free space cache and trimming
  Btrfs: fix race between fs trimming and block group remove/allocation
  Btrfs, replace: enable dev-replace for raid56
  Btrfs: fix freeing used extents after removing empty block group
  Btrfs: fix crash caused by block group removal
  Btrfs: fix invalid block group rbtree access after bg is removed
  Btrfs, raid56: fix use-after-free problem in the final device replace procedure on raid56
  Btrfs, replace: write raid56 parity into the replace target device
  Btrfs, replace: write dirty pages into the replace target device
  Btrfs, raid56: support parity scrub on raid56
  Btrfs, raid56: use a variant to record the operation type
  Btrfs, scrub: repair the common data on RAID5/6 if it is corrupted
  Btrfs, raid56: don't change bbio and raid_map
  Btrfs: remove unnecessary code of stripe_index assignment in __btrfs_map_block
  Btrfs: remove noused bbio_ret in __btrfs_map_block in condition
  ...
2014-12-12 11:15:23 -08:00
Linus Torvalds
a7cb7bb664 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree update from Jiri Kosina:
 "Usual stuff: documentation updates, printk() fixes, etc"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits)
  intel_ips: fix a type in error message
  cpufreq: cpufreq-dt: Move newline to end of error message
  ps3rom: fix error return code
  treewide: fix typo in printk and Kconfig
  ARM: dts: bcm63138: change "interupts" to "interrupts"
  Replace mentions of "list_struct" to "list_head"
  kernel: trace: fix printk message
  scsi: mpt2sas: fix ioctl in comment
  zbud, zswap: change module author email
  clocksource: Fix 'clcoksource' typo in comment
  arm: fix wording of "Crotex" in CONFIG_ARCH_EXYNOS3 help
  gpio: msm-v1: make boolean argument more obvious
  usb: Fix typo in usb-serial-simple.c
  PCI: Fix comment typo 'COMFIG_PM_OPS'
  powerpc: Fix comment typo 'CONIFG_8xx'
  powerpc: Fix comment typos 'CONFiG_ALTIVEC'
  clk: st: Spelling s/stucture/structure/
  isci: Spelling s/stucture/structure/
  usb: gadget: zero: Spelling s/infrastucture/infrastructure/
  treewide: Fix company name in module descriptions
  ...
2014-12-12 10:08:06 -08:00
Linus Torvalds
ccb5a4910d Merge tag 'upstream-3.19-rc1' of git://git.infradead.org/linux-ubifs
Pull UBI/UBIFS updates from Artem Bityutskiy:
 "This includes the following UBI/UBIFS changes:
   - UBI debug messages now include the UBI device number.  This change
     is responsible for the big diffstat since it touched every
     debugging print statement.
   - An Xattr bug-fix which fixes SELinux support
   - Several error path fixes in UBI/UBIFS"

* tag 'upstream-3.19-rc1' of git://git.infradead.org/linux-ubifs:
  UBI: Fix invalid vfree()
  UBI: Fix double free after do_sync_erase()
  UBIFS: fix a couple bugs in UBIFS xattr length calculation
  UBI: vtbl: Use ubi_eba_atomic_leb_change()
  UBI: Extend UBI layer debug/messaging capabilities
  UBIFS: fix budget leak in error path
2014-12-12 09:57:22 -08:00
Linus Torvalds
c05e14f7b3 Merge tag 'xfs-for-linus-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull xfs update from Dave Chinner:
 "There's relatively little change in this update; it is mainly bug
  fixes, cleanups and more of the on-going libxfs restructuring and
  on-disk format header consolidation work.

  Details:
   - more on-disk format header consolidation
   - move some structures shared with userspace to libxfs
   - new per-mount workqueue to fix for deadlocks between nested loop
     mounted filesystems
   - various bug fixes for ENOSPC, stats, quota off and preallocation
   - a bunch of compiler warning fixes for set-but-unused variables
   - various code cleanups"

* tag 'xfs-for-linus-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (24 commits)
  xfs: split metadata and log buffer completion to separate workqueues
  xfs: fix set-but-unused warnings
  xfs: move type conversion functions to xfs_dir.h
  xfs: move ftype conversion functions to libxfs
  xfs: lobotomise xfs_trans_read_buf_map()
  xfs: active inodes stat is broken
  xfs: cleanup xfs_bmse_merge returns
  xfs: cleanup xfs_bmse_shift_one goto mess
  xfs: fix premature enospc on inode allocation
  xfs: overflow in xfs_iomap_eof_align_last_fsb
  xfs: fix simple_return.cocci warning in xfs_bmse_shift_one
  xfs: fix simple_return.cocci warning in xfs_file_readdir
  libxfs: fix simple_return.cocci warnings
  xfs: remove unnecessary null checks
  xfs: merge xfs_inum.h into xfs_format.h
  xfs: move most of xfs_sb.h to xfs_format.h
  xfs: merge xfs_ag.h into xfs_format.h
  xfs: move acl structures to xfs_format.h
  xfs: merge xfs_dinode.h into xfs_format.h
  xfs: catch invalid negative blknos in _xfs_buf_find()
  ...
2014-12-12 09:48:17 -08:00
Linus Torvalds
9bfccec24e Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
 "Lots of bugs fixes, including Zheng and Jan's extent status shrinker
  fixes, which should improve CPU utilization and potential soft lockups
  under heavy memory pressure, and Eric Whitney's bigalloc fixes"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (26 commits)
  ext4: ext4_da_convert_inline_data_to_extent drop locked page after error
  ext4: fix suboptimal seek_{data,hole} extents traversial
  ext4: ext4_inline_data_fiemap should respect callers argument
  ext4: prevent fsreentrance deadlock for inline_data
  ext4: forbid journal_async_commit in data=ordered mode
  jbd2: remove unnecessary NULL check before iput()
  ext4: Remove an unnecessary check for NULL before iput()
  ext4: remove unneeded code in ext4_unlink
  ext4: don't count external journal blocks as overhead
  ext4: remove never taken branch from ext4_ext_shift_path_extents()
  ext4: create nojournal_checksum mount option
  ext4: update comments regarding ext4_delete_inode()
  ext4: cleanup GFP flags inside resize path
  ext4: introduce aging to extent status tree
  ext4: cleanup flag definitions for extent status tree
  ext4: limit number of scanned extents in status tree shrinker
  ext4: move handling of list of shrinkable inodes into extent status code
  ext4: change LRU to round-robin in extent status tree shrinker
  ext4: cache extent hole in extent status tree for ext4_da_map_blocks()
  ext4: fix block reservation for bigalloc filesystems
  ...
2014-12-12 09:28:03 -08:00
David Sterba
ce3e69847e btrfs: sink parameter len to alloc_extent_buffer
Because we're using globally known nodesize. Do the same for the sanity
test function variant.

Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:26:57 +01:00
David Sterba
3f556f7853 btrfs: unify extent buffer allocation api
Make the extent buffer allocation interface consistent.  Cloned eb will
set a valid fs_info.  For dummy eb, we can drop the length parameter and
set it from fs_info.

The built-in sanity checks may pass a NULL fs_info that's queried for
nodesize, but we know it's 4096.

Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:26:55 +01:00
David Sterba
23d79d81b1 btrfs: use GFP_NOFS in __alloc_extent_buffer directly
Same mask from all callers.

Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:07:23 +01:00
David Sterba
7476dfdaad btrfs: sink blocksize parameter to tree_block_processed
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:07:22 +01:00
David Sterba
a83fffb75d btrfs: sink blocksize parameter to btrfs_find_create_tree_block
Finally it's clear that the requested blocksize is always equal to
nodesize, with one exception, the superblock.

Superblock has fixed size regardless of the metadata block size, but
uses the same helpers to initialize sys array/chunk tree and to work
with the chunk items. So it pretends to be an extent_buffer for a
moment, btrfs_read_sys_array is full of special cases, we're adding one
more.

Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:07:21 +01:00
David Sterba
fe864576de btrfs: sink blocksize parameter to btrfs_init_new_buffer
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:07:20 +01:00
David Sterba
c0dcaa4d7b btrfs: sink blocksize parameter to reada_tree_block_flagged
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:07:20 +01:00
David Sterba
b6ae40ec76 btrfs: remove blocksize from reada_extent
Replace with global nodesize instead.

Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:07:19 +01:00
David Sterba
d3e46fea1b btrfs: sink blocksize parameter to readahead_tree_block
All callers pass nodesize.

Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-12 18:07:18 +01:00
Miklos Szeredi
1c68271cf1 fuse: use file_inode() in fuse_file_fallocate()
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-12 10:04:51 +01:00
Miklos Szeredi
7078187a79 fuse: introduce fuse_simple_request() helper
The following pattern is repeated many times:

	req = fuse_get_req_nopages(fc);
	/* Initialize req->(in|out).args */
	fuse_request_send(fc, req);
	err = req->out.h.error;
	fuse_put_request(req);

Create a new replacement helper:

	/* Initialize args */
	err = fuse_simple_request(fc, &args);

In addition to reducing the code size, this will ease moving from the
complex arg-based to a simpler page-based I/O on the fuse device.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-12 09:49:05 +01:00
Miklos Szeredi
f704dcb538 fuse: reduce max out args
The third out-arg is never actually used.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-12 09:49:05 +01:00
Miklos Szeredi
baebccbe99 fuse: hold inode instead of path after release
path_put() in release could trigger a DESTROY request in fuseblk.  The
possible deadlock was worked around by doing the path_put() with
schedule_work().

This complexity isn't needed if we just hold the inode instead of the path.
Since we now flush all requests before destroying the super block we can be
sure that all held inodes will be dropped.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-12 09:49:04 +01:00
Miklos Szeredi
580640ba5d fuse: flush requests on umount
Use fuse_abort_conn() instead of fuse_conn_kill() in fuse_put_super().
This flushes and aborts requests still on any queues.  But since we've
already reset fc->connected, those requests would not be useful anyway and
would be flushed when the fuse device is closed.

Next patches will rely on requests being flushed before the superblock is
destroyed.

Use fuse_abort_conn() in cuse_process_init_reply() too, since it makes no
difference there, and we can get rid of fuse_conn_kill().

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-12 09:49:04 +01:00
Miklos Szeredi
0c4dd4ba14 fuse: don't wake up reserved req in fuse_conn_kill()
Waking up reserved_req_waitq from fuse_conn_kill() doesn't make sense since
we aren't chaging ff->reserved_req here, which is what this waitqueue
signals.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2014-12-12 09:49:04 +01:00
Linus Torvalds
c0222ac086 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 "This is an unusually large pull request for MIPS - in parts because
  lots of patches missed the 3.18 deadline but primarily because some
  folks opened the flood gates.

   - Retire the MIPS-specific phys_t with the generic phys_addr_t.
   - Improvments for the backtrace code used by oprofile.
   - Better backtraces on SMP systems.
   - Cleanups for the Octeon platform code.
   - Cleanups and fixes for the Loongson platform code.
   - Cleanups and fixes to the firmware library.
   - Switch ATH79 platform to use the firmware library.
   - Grand overhault to the SEAD3 and Malta interrupt code.
   - Move the GIC interrupt code to drivers/irqchip
   - Lots of GIC cleanups and updates to the GIC code to use modern IRQ
     infrastructures and features of the kernel.
   - OF documentation updates for the GIC bindings
   - Move GIC clocksource driver to drivers/clocksource
   - Merge GIC clocksource driver with clockevent driver.
   - Further updates to bring the GIC clocksource driver up to date.
   - R3000 TLB code cleanups
   - Improvments to the Loongson 3 platform code.
   - Convert pr_warning to pr_warn.
   - Merge a bunch of small lantiq and ralink fixes that have been
     staged/lingering inside the openwrt tree for a while.
   - Update archhelp for IP22/IP32
   - Fix a number of issues for Loongson 1B.
   - New clocksource and clockevent driver for Loongson 1B.
   - Further work on clk handling for Loongson 1B.
   - Platform work for Broadcom BMIPS.
   - Error handling cleanups for TurboChannel.
   - Fixes and optimization to the microMIPS support.
   - Option to disable the FTLB.
   - Dump more relevant information on machine check exception
   - Change binfmt to allow arch to examine PT_*PROC headers
   - Support for new style FPU register model in O32
   - VDSO randomization.
   - BCM47xx cleanups
   - BCM47xx reimplement the way the kernel accesses NVRAM information.
   - Random cleanups
   - Add support for ATH25 platforms
   - Remove pointless locking code in some PCI platforms.
   - Some improvments to EVA support
   - Minor Alchemy cleanup"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (185 commits)
  MIPS: Add MFHC0 and MTHC0 instructions to uasm.
  MIPS: Cosmetic cleanups of page table headers.
  MIPS: Add CP0 macros for extended EntryLo registers
  MIPS: Remove now unused definition of phys_t.
  MIPS: Replace use of phys_t with phys_addr_t.
  MIPS: Replace MIPS-specific 64BIT_PHYS_ADDR with generic PHYS_ADDR_T_64BIT
  PCMCIA: Alchemy Don't select 64BIT_PHYS_ADDR in Kconfig.
  MIPS: lib: memset: Clean up some MIPS{EL,EB} ifdefery
  MIPS: iomap: Use __mem_{read,write}{b,w,l} for MMIO
  MIPS: <asm/types.h> fix indentation.
  MAINTAINERS: Add entry for BMIPS multiplatform kernel
  MIPS: Enable VDSO randomization
  MIPS: Remove a temporary hack for debugging cache flushes in SMTC configuration
  MIPS: Remove declaration of obsolete arch_init_clk_ops()
  MIPS: atomic.h: Reformat to fit in 79 columns
  MIPS: Apply `.insn' to fixup labels throughout
  MIPS: Fix microMIPS LL/SC immediate offsets
  MIPS: Kconfig: Only allow 32-bit microMIPS builds
  MIPS: signal.c: Fix an invalid cast in ISA mode bit handling
  MIPS: mm: Only build one microassembler that is suitable
  ...
2014-12-11 17:56:37 -08:00
Eric W. Biederman
9cc46516dd userns: Add a knob to disable setgroups on a per user namespace basis
- Expose the knob to user space through a proc file /proc/<pid>/setgroups

  A value of "deny" means the setgroups system call is disabled in the
  current processes user namespace and can not be enabled in the
  future in this user namespace.

  A value of "allow" means the segtoups system call is enabled.

- Descendant user namespaces inherit the value of setgroups from
  their parents.

- A proc file is used (instead of a sysctl) as sysctls currently do
  not allow checking the permissions at open time.

- Writing to the proc file is restricted to before the gid_map
  for the user namespace is set.

  This ensures that disabling setgroups at a user namespace
  level will never remove the ability to call setgroups
  from a process that already has that ability.

  A process may opt in to the setgroups disable for itself by
  creating, entering and configuring a user namespace or by calling
  setns on an existing user namespace with setgroups disabled.
  Processes without privileges already can not call setgroups so this
  is a noop.  Prodcess with privilege become processes without
  privilege when entering a user namespace and as with any other path
  to dropping privilege they would not have the ability to call
  setgroups.  So this remains within the bounds of what is possible
  without a knob to disable setgroups permanently in a user namespace.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2014-12-11 18:06:36 -06:00
Linus Torvalds
70e71ca0af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

 2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers.  Thanks to Al Viro
    and Herbert Xu.

 3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

 4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

 5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

 6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

 7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

 8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

 9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets.  From Alexei
    Starovoitov.

10) Support TSO/LSO in sunvnet driver, from David L Stevens.

11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

12) Remote checksum offload, from Tom Herbert.

13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

14) Add MPLS support to openvswitch, from Simon Horman.

15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet.  This tries to resolve the conflicting goals between the
    desired handling of bulk vs.  RPC-like traffic.

17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU.  From Eric Dumazet.

18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

21) Add VLAN packet scheduler action, from Jiri Pirko.

22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
  Fix race condition between vxlan_sock_add and vxlan_sock_release
  net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
  net/mlx4: Add support for A0 steering
  net/mlx4: Refactor QUERY_PORT
  net/mlx4_core: Add explicit error message when rule doesn't meet configuration
  net/mlx4: Add A0 hybrid steering
  net/mlx4: Add mlx4_bitmap zone allocator
  net/mlx4: Add a check if there are too many reserved QPs
  net/mlx4: Change QP allocation scheme
  net/mlx4_core: Use tasklet for user-space CQ completion events
  net/mlx4_core: Mask out host side virtualization features for guests
  net/mlx4_en: Set csum level for encapsulated packets
  be2net: Export tunnel offloads only when a VxLAN tunnel is created
  gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
  cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
  net: fec: only enable mdio interrupt before phy device link up
  net: fec: clear all interrupt events to support i.MX6SX
  net: fec: reset fep link status in suspend function
  net: sock: fix access via invalid file descriptor
  net: introduce helper macro for_each_cmsghdr
  ...
2014-12-11 14:27:06 -08:00
Tony Lindgren
027bc8b082 pstore-ram: Allow optional mapping with pgprot_noncached
On some ARMs the memory can be mapped pgprot_noncached() and still
be working for atomic operations. As pointed out by Colin Cross
<ccross@android.com>, in some cases you do want to use
pgprot_noncached() if the SoC supports it to see a debug printk
just before a write hanging the system.

On ARMs, the atomic operations on strongly ordered memory are
implementation defined. So let's provide an optional kernel parameter
for configuring pgprot_noncached(), and use pgprot_writecombine() by
default.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robherring2@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: stable@vger.kernel.org
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-12-11 13:38:31 -08:00
Rob Herring
7ae9cb8193 pstore-ram: Fix hangs by using write-combine mappings
Currently trying to use pstore on at least ARMs can hang as we're
mapping the peristent RAM with pgprot_noncached().

On ARMs, pgprot_noncached() will actually make the memory strongly
ordered, and as the atomic operations pstore uses are implementation
defined for strongly ordered memory, they may not work. So basically
atomic operations have undefined behavior on ARM for device or strongly
ordered memory types.

Let's fix the issue by using write-combine variants for mappings. This
corresponds to normal, non-cacheable memory on ARM. For many other
architectures, this change does not change the mapping type as by
default we have:

#define pgprot_writecombine pgprot_noncached

The reason why pgprot_noncached() was originaly used for pstore
is because Colin Cross <ccross@android.com> had observed lost
debug prints right before a device hanging write operation on some
systems. For the platforms supporting pgprot_noncached(), we can
add a an optional configuration option to support that. But let's
get pstore working first before adding new features.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Colin Cross <ccross@android.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
[tony@atomide.com: updated description]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-12-11 13:35:49 -08:00
Al Viro
93fe74b2e2 coda_venus_readdir(): use file_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-11 16:28:12 -05:00
Al Viro
d465887f9d fs/namei.c: fold link_path_walk() call into path_init()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-11 16:27:57 -05:00
Al Viro
980f3ea2f6 path_init(): don't bother with LOOKUP_PARENT in argument
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-11 16:27:57 -05:00
Al Viro
893b7775a7 fs/namei.c: new helper (path_cleanup())
All callers of path_init() proceed to do the identical cleanup when
they are done with nameidata.  Don't open-code it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-11 16:27:57 -05:00
Al Viro
5e53084d77 path_init(): store the "base" pointer to file in nameidata itself
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-11 16:27:57 -05:00
Linus Torvalds
b6da0076ba Merge branch 'akpm' (patchbomb from Andrew)
Merge first patchbomb from Andrew Morton:
 - a few minor cifs fixes
 - dma-debug upadtes
 - ocfs2
 - slab
 - about half of MM
 - procfs
 - kernel/exit.c
 - panic.c tweaks
 - printk upates
 - lib/ updates
 - checkpatch updates
 - fs/binfmt updates
 - the drivers/rtc tree
 - nilfs
 - kmod fixes
 - more kernel/exit.c
 - various other misc tweaks and fixes

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
  exit: pidns: fix/update the comments in zap_pid_ns_processes()
  exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
  exit: exit_notify: re-use "dead" list to autoreap current
  exit: reparent: call forget_original_parent() under tasklist_lock
  exit: reparent: avoid find_new_reaper() if no children
  exit: reparent: introduce find_alive_thread()
  exit: reparent: introduce find_child_reaper()
  exit: reparent: document the ->has_child_subreaper checks
  exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper()
  exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting
  exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting
  exit: proc: don't try to flush /proc/tgid/task/tgid
  exit: release_task: fix the comment about group leader accounting
  exit: wait: drop tasklist_lock before psig->c* accounting
  exit: wait: don't use zombie->real_parent
  exit: wait: cleanup the ptrace_reparented() checks
  usermodehelper: kill the kmod_thread_locker logic
  usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper()
  fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp
  nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races
  ...
2014-12-10 18:34:42 -08:00
Al Viro
bd9b51e79c make default ->i_fop have ->open() fail with ENXIO
As it is, default ->i_fop has NULL ->open() (along with all other methods).
The only case where it matters is reopening (via procfs symlink) a file that
didn't get its ->f_op from ->i_fop - anything else will have ->i_fop assigned
to something sane (default would fail on read/write/ioctl/etc.).

	Unfortunately, such case exists - alloc_file() users, especially
anon_get_file() ones.  There we have tons of opened files of very different
kinds sharing the same inode.  As the result, attempt to reopen those via
procfs succeeds and you get a descriptor you can't do anything with.

	Moreover, in case of sockets we set ->i_fop that will only be used
on such reopen attempts - and put a failing ->open() into it to make sure
those do not succeed.

	It would be simpler to put such ->open() into default ->i_fop and leave
it unchanged both for anon inode (as we do anyway) and for socket ones.  Result:
	* everything going through do_dentry_open() works as it used to
	* sock_no_open() kludge is gone
	* attempts to reopen anon-inode files fail as they really ought to
	* ditto for aio_private_file()
	* ditto for perfmon - this one actually tried to imitate sock_no_open()
trick, but failed to set ->i_fop, so in the current tree reopens succeed and
yield completely useless descriptor.  Intent clearly had been to fail with
-ENXIO on such reopens; now it actually does.
	* everything else that used alloc_file() keeps working - it has ->i_fop
set for its inodes anyway

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-10 21:32:15 -05:00
Al Viro
1f55a6ec94 make nameidata completely opaque outside of fs/namei.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-10 21:32:13 -05:00
Al Viro
707c5960f1 Merge branch 'nsfs' into for-next 2014-12-10 21:31:59 -05:00