Commit Graph

886869 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
7624e69465 perf map: Move seldom used ->flags field to second cacheline
So we start with:

  $ pahole -C map ~/bin/perf
  struct map {
  	union {
  		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
  		struct list_head node;                   /*     0    16 */
  	} __attribute__((__aligned__(8)));                                               /*     0    24 */
  	u64                        start;                /*    24     8 */
  	u64                        end;                  /*    32     8 */
  	_Bool                      erange_warned:1;      /*    40: 0  1 */
  	_Bool                      priv:1;               /*    40: 1  1 */

  	/* XXX 6 bits hole, try to pack */
  	/* XXX 3 bytes hole, try to pack */

  	u32                        prot;                 /*    44     4 */
  	u32                        flags;                /*    48     4 */

  	/* XXX 4 bytes hole, try to pack */

  	u64                        pgoff;                /*    56     8 */
  	/* --- cacheline 1 boundary (64 bytes) --- */
  	u64                        reloc;                /*    64     8 */
  	u32                        maj;                  /*    72     4 */
  	u32                        min;                  /*    76     4 */
  	u64                        ino;                  /*    80     8 */
  	u64                        ino_generation;       /*    88     8 */
  	u64                        (*map_ip)(struct map *, u64); /*    96     8 */
  	u64                        (*unmap_ip)(struct map *, u64); /*   104     8 */
  	struct dso *               dso;                  /*   112     8 */
  	refcount_t                 refcnt;               /*   120     4 */

  	/* size: 128, cachelines: 2, members: 17 */
  	/* sum members: 116, holes: 2, sum holes: 7 */
  	/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
  	/* padding: 4 */
  	/* forced alignments: 1 */
  } __attribute__((__aligned__(8)));
  $

and 'flags' is seldom used when printing details about the map or with
the "cacheline" sort order, we can move them it to the second cacheline,
that will allow combining it with 'refcnt', that is only four bytes:

  $ pahole -C map ~/bin/perf
  struct map {
  	union {
  		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
  		struct list_head node;                   /*     0    16 */
  	} __attribute__((__aligned__(8)));                                               /*     0    24 */
  	u64                        start;                /*    24     8 */
  	u64                        end;                  /*    32     8 */
  	_Bool                      erange_warned:1;      /*    40: 0  1 */
  	_Bool                      priv:1;               /*    40: 1  1 */

  	/* XXX 6 bits hole, try to pack */
  	/* XXX 3 bytes hole, try to pack */

  	u32                        prot;                 /*    44     4 */
  	u64                        pgoff;                /*    48     8 */
  	u64                        reloc;                /*    56     8 */
  	/* --- cacheline 1 boundary (64 bytes) --- */
  	u32                        maj;                  /*    64     4 */
  	u32                        min;                  /*    68     4 */
  	u64                        ino;                  /*    72     8 */
  	u64                        ino_generation;       /*    80     8 */
  	u64                        (*map_ip)(struct map *, u64); /*    88     8 */
  	u64                        (*unmap_ip)(struct map *, u64); /*    96     8 */
  	struct dso *               dso;                  /*   104     8 */
  	refcount_t                 refcnt;               /*   112     4 */
  	u32                        flags;                /*   116     4 */

  	/* size: 120, cachelines: 2, members: 17 */
  	/* sum members: 116, holes: 1, sum holes: 3 */
  	/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
  	/* forced alignments: 1 */
  	/* last cacheline: 56 bytes */
  } __attribute__((__aligned__(8)));
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-2cdw3zlw1mkamaf7nqtdlxfi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18 16:51:00 -03:00
Jonathan Corbet
5ca470a0c3 docs: Add request_irq() documentation
While checking the results of the :c:func: removal, I noticed that there
was no documentation for request_irq(), and request_threaded_irq() was not
mentioned at all.  Add a kerneldoc comment for request_irq() and add
request_threaded_irq() to the list of functions.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-18 12:40:59 -07:00
Arnaldo Carvalho de Melo
dbc984c961 perf map: Use bitmap for booleans
The map->priv and map->erange_warned are seldom used, the first only in
tests/vmlinux-kallsyms.c, the later only when hist_entry__inc_addr_samples()
returns -ERANGE in 'perf top', which are really rare occasions, so make
them a bool bitfield.

This will open up space for other members on the first cacheline.

  $ pahole -C map ~/bin/perf
  struct map {
  	union {
  		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
  		struct list_head node;                   /*     0    16 */
  	} __attribute__((__aligned__(8)));                                               /*     0    24 */
  	u64                        start;                /*    24     8 */
  	u64                        end;                  /*    32     8 */
  	_Bool                      erange_warned:1;      /*    40: 0  1 */
  	_Bool                      priv:1;               /*    40: 1  1 */

  	/* XXX 6 bits hole, try to pack */
  	/* XXX 3 bytes hole, try to pack */

  	u32                        prot;                 /*    44     4 */
  	u32                        flags;                /*    48     4 */

  	/* XXX 4 bytes hole, try to pack */

  	u64                        pgoff;                /*    56     8 */
  	/* --- cacheline 1 boundary (64 bytes) --- */
  	u64                        reloc;                /*    64     8 */
  	u32                        maj;                  /*    72     4 */
  	u32                        min;                  /*    76     4 */
  	u64                        ino;                  /*    80     8 */
  	u64                        ino_generation;       /*    88     8 */
  	u64                        (*map_ip)(struct map *, u64); /*    96     8 */
  	u64                        (*unmap_ip)(struct map *, u64); /*   104     8 */
  	struct dso *               dso;                  /*   112     8 */
  	refcount_t                 refcnt;               /*   120     4 */

  	/* size: 128, cachelines: 2, members: 17 */
  	/* sum members: 116, holes: 2, sum holes: 7 */
  	/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
  	/* padding: 4 */
  	/* forced alignments: 1 */
  } __attribute__((__aligned__(8)));
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-g5545pcq4ff0wr17tfb1piqt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18 16:29:01 -03:00
Chris Wilson
c0fa92ec89 drm/i915: Protect request peeking with RCU
Since the execlists_active() is no longer protected by the
engine->active.lock, we need to protect the request pointer with RCU to
prevent it being freed as we evaluate whether or not we need to preempt.

Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191104090158.2959-2-chris@chris-wilson.co.uk
(cherry picked from commit 7d14863525)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 8eb4704b12)
(cherry picked from commit 7e27238e149ce4f00d9cd801fe3aa0ea55e986a2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18 11:25:16 -08:00
Jaskaran Singh
e8a9e30d72 docs: filesystems: Add mount map description in Content
The second paragraph of the content section does not properly
describe how mount points are determined by autofs.

Replace the lines detailing how the determination of these mount
points is "ad hoc" by a short description of the mount map syntax
used by autofs.

Signed-off-by: Jaskaran Singh <jaskaransingh7654321@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-18 12:19:59 -07:00
Jaskaran Singh
c11565e887 docs: filesystems: Update code snippets in autofs.rst
Some of the struct definitions now have an autofs packet header.
Reflect these changes by adding a definition of this header and
place it wherever suitable.

Signed-off-by: Jaskaran Singh <jaskaransingh7654321@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-18 12:19:53 -07:00
Jaskaran Singh
f11f2a3c54 docs: filesystems: convert autofs.txt to reST
Convert autofs.txt to reST.

The following changes abound:

- Introduce reST formatting for headings, lists et al.
- Add an indentation of an 8 space tab wherever suitable, so as
  to maintain consistency.
- Remove indentation of the description of the ioctls which are similar
  to the AUTOFS_IOC ioctls, as it does not come out quite right in HTML.
- Add an entry for autofs in the index.

Signed-off-by: Jaskaran Singh <jaskaransingh7654321@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-18 12:17:17 -07:00
Paul Cercueil
33c26ab4d6 watchdog: jz4740: Drop dependency on MACH_JZ47xx
Depending on MACH_JZ47xx prevent us from creating a generic kernel that
works on more than one MIPS board. Instead, we just depend on MIPS being
set.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191023174714.14362-3-paul@crapouillou.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 20:14:47 +01:00
Paul Cercueil
6d532143c9 watchdog: jz4740: Use regmap provided by TCU driver
Since we broke the ABI by changing the clock, the driver was also
updated to use the regmap provided by the TCU driver.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Tested-by: Mathieu Malaterre <malat@debian.org>
Tested-by: Artur Rojek <contact@artur-rojek.eu>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191023174714.14362-2-paul@crapouillou.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 20:14:46 +01:00
Paul Cercueil
1d9c307454 watchdog: jz4740: Use WDT clock provided by TCU driver
Instead of requesting the "ext" clock and handling the watchdog clock
divider and gating in the watchdog driver, we now request and use the
"wdt" clock that is supplied by the ingenic-timer "TCU" driver.

The major benefit is that the watchdog's clock rate and parent can now
be specified from within devicetree, instead of hardcoded in the driver.

Also, this driver won't poke anymore into the TCU registers to
enable/disable the clock, as this is now handled by the TCU driver.

On the bad side, we break the ABI with devicetree - as we now request a
different clock. In this very specific case it is still okay, as every
Ingenic JZ47xx-based board out there compile the devicetree within the
kernel; so it's still time to push breaking changes, in order to get a
clean devicetree that won't break once it musn't.

Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Tested-by: Mathieu Malaterre <malat@debian.org>
Tested-by: Artur Rojek <contact@artur-rojek.eu>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191023174714.14362-1-paul@crapouillou.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 20:14:46 +01:00
Eugen Hristev
74f078c333 dt-bindings: watchdog: sama5d4_wdt: add microchip,sam9x60-wdt compatible
The Atmel sama5d4_wdt needs to be compatible with microchip,sam9x60-wdt
The sama5d4_wdt driver is updated to work with both hardware blocks
(sama5d4/sama5d2 and sam9x60 based blocks)

Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/1573806579-7981-2-git-send-email-eugen.hristev@microchip.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 20:14:46 +01:00
Eugen Hristev
d615a6fd12 watchdog: sama5d4_wdt: cleanup the bit definitions
Cleanup the macro definitions to use BIT and align with two spaces.

Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/1573806579-7981-1-git-send-email-eugen.hristev@microchip.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 20:14:45 +01:00
Josef Bacik
3e1740993e btrfs: record all roots for rename exchange on a subvol
Testing with the new fsstress support for subvolumes uncovered a pretty
bad problem with rename exchange on subvolumes.  We're modifying two
different subvolumes, but we only start the transaction on one of them,
so the other one is not added to the dirty root list.  This is caught by
btrfs_cow_block() with a warning because the root has not been updated,
however if we do not modify this root again we'll end up pointing at an
invalid root because the root item is never updated.

Fix this by making sure we add the destination root to the trans list,
the same as we do with normal renames.  This fixes the corruption.

Fixes: cdd1fedf82 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT")
CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 20:08:31 +01:00
Shuiqing Li
39e68d9e7a watchdog: sprd: Fix the incorrect pointer getting from driver data
The device driver data saved the 'struct sprd_wdt' object, it is
incorrect to get 'struct watchdog_device' object from the driver
data, thus fix it.

Fixes: 4776034670 ("watchdog: Add Spreadtrum watchdog driver")
Reported-by: Dongwei Wang <dongwei.wang@unisoc.com>
Signed-off-by: Shuiqing Li <shuiqing.li@unisoc.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/76d4687189ec940baa90cb8d679a8d4c8f02ee80.1573210405.git.baolin.wang@linaro.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:45 +01:00
Joel Stanley
c04571251b watchdog: aspeed: Fix clock behaviour for ast2600
The ast2600 no longer uses bit 4 in the control register to indicate a
1MHz clock (It now controls whether this watchdog is reset by a SOC
reset). This means we do not want to set it. It also does not need to be
set for the ast2500, as it is read-only on that SoC.

The comment next to the clock rate selection wandered away from where it
was set, so put it back next to the register setting it's describing.

Fixes: b3528b4874 ("watchdog: aspeed: Add support for AST2600")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191108032905.22463-1-joel@jms.id.au
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:44 +01:00
Fabio Estevam
6083ab7b2f watchdog: imx7ulp: Fix reboot hang
The following hang is observed when a 'reboot' command is issued:

# reboot
# Stopping network: OK
Stopping klogd: OK
Stopping syslogd: OK
umount: devtmpfs busy - remounted read-only
[    8.612079] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system reboot
[   10.694753] reboot: Restarting system
[   11.699008] Reboot failed -- System halted

Fix this problem by adding a .restart ops member.

Fixes: 41b630f41b ("watchdog: Add i.MX7ULP watchdog support")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191029174037.25381-1-festevam@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:44 +01:00
Rasmus Villemoes
bc44fa734c watchdog: make nowayout sysfs file writable
It can be useful to delay setting the nowayout feature for a watchdog
device. Moreover, not every driver (notably gpio_wdt) implements a
nowayout module parameter/otherwise respects CONFIG_WATCHDOG_NOWAYOUT,
and modifying those drivers carries a risk of causing a regression for
someone who has two watchdog devices, sets CONFIG_WATCHDOG_NOWAYOUT
and somehow relies on the gpio_wdt driver being ignorant of
that (i.e., allowing one to gracefully close a gpio_wdt but not the
other watchdog in the system).

So instead, simply make the nowayout sysfs file writable. Obviously,
setting nowayout is a one-way street.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191105205118.11359-1-linux@rasmusvillemoes.dk
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:43 +01:00
Julia Cartwright
a19f89335f watchdog: prevent deferral of watchdogd wakeup on RT
When PREEMPT_RT is enabled, all hrtimer expiry functions are
deferred for execution into the context of ksoftirqd unless otherwise
annotated.

Deferring the expiry of the hrtimer used by the watchdog core, however,
is a waste, as the callback does nothing but queue a kthread work item
and wakeup watchdogd.

It's worst then that, too: the deferral through ksoftirqd also means
that for correct behavior a user must adjust the scheduling parameters
of both watchdogd _and_ ksoftirqd, which is unnecessary and has other
side effects (like causing unrelated expiry functions to execute at
potentially elevated priority).

Instead, mark the hrtimer used by the watchdog core as being _HARD to
allow it's execution directly from hardirq context.  The work done in
this expiry function is well-bounded and minimal.

A user still must adjust the scheduling parameters of the watchdogd
to be correct w.r.t. their application needs.

Link: https://lkml.kernel.org/r/0e02d8327aeca344096c246713033887bc490dd7.1538089180.git.julia@ni.com
Cc: Guenter Roeck <linux@roeck-us.net>
Reported-and-tested-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Reported-by: Tim Sander <tim@krieglstein.org>
Signed-off-by: Julia Cartwright <julia@ni.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
[bigeasy: use only HRTIMER_MODE_REL_HARD]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191105144506.clyadjbvnn7b7b2m@linutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:43 +01:00
Fabio Estevam
eccb7fe5e7 watchdog: imx7ulp: Use definitions instead of magic values
Use definitions instead of magic values in order to improve readability.

Since the CLK field of the WDOG CS register is composed of two bits
to select the watchdog clock source, use a shift representation
instead of BIT().

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191029174037.25381-5-festevam@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:42 +01:00
Fabio Estevam
c37e358163 watchdog: imx7ulp: Remove inline annotations
Compiler is smart enough and knows when to inline, so there
is no need to mark some of these functions as 'inline'.

Remove such annotations.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191029174037.25381-4-festevam@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:42 +01:00
Fabio Estevam
9ab26e34bb watchdog: imx7ulp: Remove unused structure member
The 'notifier_block' structure member is unused, so just remove it.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191029174037.25381-3-festevam@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:42 +01:00
Fabio Estevam
747d88a1a8 watchdog: imx7ulp: Pass the wdog instance inimx7ulp_wdt_enable()
It is more natural to pass the watchdog instance inside
imx7ulp_wdt_enable() instead of the base address.

This also has the benefit to reduce the code a bit.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191029174037.25381-2-festevam@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:41 +01:00
Geert Uytterhoeven
4c0bfc039c watchdog: wdat_wdt: Spelling s/configrable/configurable/
Fix misspelling of "configurable".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191024152856.30788-1-geert+renesas@glider.be
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:41 +01:00
Matti Vaittinen
5b256b5688 watchdog: bd70528: Trivial function documentation fix
The function documentation for the exported  ROHM BD70528 WDG control
functions used old argument names. Fix the names.

Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191010060733.GA9979@localhost.localdomain
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:41 +01:00
Michal Simek
ef14cab2d8 watchdog: cadence: Do not show error in case of deferred probe
There is no reason to show error message if clocks are not ready yet.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/d3e295d5ba79f453b4aa4128c4fa63fbd6c16920.1570544944.git.michal.simek@xilinx.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:40 +01:00
Kevin Hao
72139dfa24 watchdog: Fix the race between the release of watchdog_core_data and cdev
The struct cdev is embedded in the struct watchdog_core_data. In the
current code, we manage the watchdog_core_data with a kref, but the
cdev is manged by a kobject. There is no any relationship between
this kref and kobject. So it is possible that the watchdog_core_data is
freed before the cdev is entirely released. We can easily get the
following call trace with CONFIG_DEBUG_KOBJECT_RELEASE and
CONFIG_DEBUG_OBJECTS_TIMERS enabled.
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38
  WARNING: CPU: 23 PID: 1028 at lib/debugobjects.c:481 debug_print_object+0xb0/0xf0
  Modules linked in: softdog(-) deflate ctr twofish_generic twofish_common camellia_generic serpent_generic blowfish_generic blowfish_common cast5_generic cast_common cmac xcbc af_key sch_fq_codel openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
  CPU: 23 PID: 1028 Comm: modprobe Not tainted 5.3.0-next-20190924-yoctodev-standard+ #180
  Hardware name: Marvell OcteonTX CN96XX board (DT)
  pstate: 00400009 (nzcv daif +PAN -UAO)
  pc : debug_print_object+0xb0/0xf0
  lr : debug_print_object+0xb0/0xf0
  sp : ffff80001cbcfc70
  x29: ffff80001cbcfc70 x28: ffff800010ea2128
  x27: ffff800010bad000 x26: 0000000000000000
  x25: ffff80001103c640 x24: ffff80001107b268
  x23: ffff800010bad9e8 x22: ffff800010ea2128
  x21: ffff000bc2c62af8 x20: ffff80001103c600
  x19: ffff800010e867d8 x18: 0000000000000060
  x17: 0000000000000000 x16: 0000000000000000
  x15: ffff000bd7240470 x14: 6e6968207473696c
  x13: 5f72656d6974203a x12: 6570797420746365
  x11: 6a626f2029302065 x10: 7461747320657669
  x9 : 7463612820657669 x8 : 3378302f3078302b
  x7 : 0000000000001d7a x6 : ffff800010fd5889
  x5 : 0000000000000000 x4 : 0000000000000000
  x3 : 0000000000000000 x2 : ffff000bff948548
  x1 : 276a1c9e1edc2300 x0 : 0000000000000000
  Call trace:
   debug_print_object+0xb0/0xf0
   debug_check_no_obj_freed+0x1e8/0x210
   kfree+0x1b8/0x368
   watchdog_cdev_unregister+0x88/0xc8
   watchdog_dev_unregister+0x38/0x48
   watchdog_unregister_device+0xa8/0x100
   softdog_exit+0x18/0xfec4 [softdog]
   __arm64_sys_delete_module+0x174/0x200
   el0_svc_handler+0xd0/0x1c8
   el0_svc+0x8/0xc

This is a common issue when using cdev embedded in a struct.
Fortunately, we already have a mechanism to solve this kind of issue.
Please see commit 233ed09d7f ("chardev: add helper function to
register char devs with a struct device") for more detail.

In this patch, we choose to embed the struct device into the
watchdog_core_data, and use the API provided by the commit 233ed09d7f
to make sure that the release of watchdog_core_data and cdev are
in sequence.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20191008112934.29669-1-haokexin@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:40 +01:00
Borislav Petkov
b6276d4e4b watchdog: sbc7240_wdt: Fix yet another -Wimplicit-fallthrough warning
... by moving the fall through comment outside of the code block so that
gcc sees it.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-watchdog@vger.kernel.org
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20190929114649.22740-1-bp@alien8.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:40 +01:00
Andy Shevchenko
ff0aaacb6f watchdog: intel-mid_wdt: Add WATCHDOG_NOWAYOUT support
Normally, the watchdog is disabled when /dev/watchdog is closed, but if
CONFIG_WATCHDOG_NOWAYOUT is defined, then it means that the watchdog should
remain enabled. So we should keep it enabled if CONFIG_WATCHDOG_NOWAYOUT
is defined.

Reported-by: Razvan Becheriu <razvan.becheriu@qualitance.com>
Cc: Ferry Toth <fntoth@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Razvan Becheriu <razvan.becheriu@gmail.com>
Link: https://lore.kernel.org/r/20190924143116.69823-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:39 +01:00
Anson Huang
86865322d2 watchdog: imx2_wdt: Use 'dev' instead of dereferencing it repeatedly
Add helper variable dev = &pdev->dev to simply the code.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/1569308828-8320-3-git-send-email-Anson.Huang@nxp.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:39 +01:00
Anson Huang
ebe66ded6d watchdog: imx2_wdt: Use __maybe_unused instead of #if CONFIG_PM_SLEEP
Use __maybe_unused for power management related functions instead
of #if CONFIG_PM_SLEEP to simply the code.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/1569308828-8320-1-git-send-email-Anson.Huang@nxp.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:39 +01:00
Anson Huang
87d184680d watchdog: imx2_wdt: Remove unnecessary blank line
Remove unnecessary blank line.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/1569308828-8320-1-git-send-email-Anson.Huang@nxp.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:38 +01:00
Srikanth Krishnakar
e11cfc6976 watchdog: w83627hf_wdt: Support NCT6116D
The watchdog controller on NCT6116D is compatible with NCT6102D.
Extend the support to enable SuperIO based NCT6116D watchdog device.

Signed-off-by: Srikanth Krishnakar <Srikanth_Krishnakar@mentor.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20190918160458.10108-1-Srikanth_Krishnakar@mentor.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2019-11-18 19:53:38 +01:00
Sudip Mukherjee
55ed51fff2 {tty: serial, nand: onenand}: samsung: rename to fix build warning
Any arm config which has 'CONFIG_MTD_ONENAND_SAMSUNG=m' and
'CONFIG_SERIAL_SAMSUNG=m' gives a build warning:

warning: same module names found:
  drivers/tty/serial/samsung.ko
  drivers/mtd/nand/onenand/samsung.ko

Rename both drivers/tty/serial/samsung.c to
drivers/tty/serial/samsung_tty.c and drivers/mtd/nand/onenand/samsung.c
drivers/mtd/nand/onenand/samsung_mtd.c to fix the warning.

Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Acked-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20191117202435.28127-1-sudipm.mukherjee@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-18 18:49:11 +01:00
Luhua Xu
ae7c2d342a spi: mediatek: add SPI_CS_HIGH support
Change to use SPI_CS_HIGH to support spi CS polarity setting
for chips support enhance_timing.

Signed-off-by: Luhua Xu <luhua.xu@mediatek.com>
Link: https://lore.kernel.org/r/1574053037-26721-2-git-send-email-luhua.xu@mediatek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2019-11-18 17:47:00 +00:00
Chris Wilson
2d691aeca4 drm/i915/userptr: Try to acquire the page lock around set_page_dirty()
set_page_dirty says:

	For pages with a mapping this should be done under the page lock
	for the benefit of asynchronous memory errors who prefer a
	consistent dirty state. This rule can be broken in some special
	cases, but should be better not to.

Under those rules, it is only safe for us to use the plain set_page_dirty
calls for shmemfs/anonymous memory. Userptr may be used with real
mappings and so needs to use the locked version (set_page_dirty_lock).

However, following a try_to_unmap() we may want to remove the userptr and
so call put_pages(). However, try_to_unmap() acquires the page lock and
so we must avoid recursively locking the pages ourselves -- which means
that we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012
Fixes: 5cc9ed4b9a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
References: cb6d7c7dc7 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: 505a8ec7e1 ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"")
References: 6dcc693bc5 ("ext4: warn when page is dirtied without buffers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk
(cherry picked from commit 0d4bbe3d40)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit cee7fb437e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18 09:20:45 -08:00
Chris Wilson
add3eeed36 drm/i915/pmu: "Frequency" is reported as accumulated cycles
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
(cherry picked from commit e88866ef02)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit a7d87b70d6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18 09:20:38 -08:00
Ville Syrjälä
1aa4df7e41 drm/i915: Preload LUTs if the hw isn't currently using them
The LUTs are single buffered so in order to program them without
tearing we'd have to do it during vblank (actually to be 100%
effective it has to happen between start of vblank and frame start).
We have no proper mechanism for that at the moment so we just
defer loading them after the vblank waits have happened. That
is not quite sufficient (especially when committing multiple pipes
whose vblanks don't line up) so the LUT load will often leak into
the following frame causing tearing.

However in case the hardware wasn't previously using the LUT we
can preload it before setting the enable bit (which is double
buffered so won't tear). Let's determine if we can do such
preloading and make it happen. Slight variation between the
hardware requires some platforms specifics in the checks.

Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win
and Asus T100HA) when the gamma LUT gets loaded for the first
time as the BIOS has left some junk in the LUT memory.

v2: Deal with uapi vs. hw crtc state split
    s/GCM/CGM/ typo fix

Cc: Hans de Goede <hdegoede@redhat.com>
Fixes: 051a6d8d3c ("drm/i915: Move LUT programming to happen after vblank waits")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 0ccc42a2fd)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit f77021372e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18 09:20:27 -08:00
Ville Syrjälä
8ac495f624 drm/i915: Don't oops in dumb_create ioctl if we have no crtcs
Make sure we have a crtc before probing its primary plane's
max stride. Initially I thought we can't get this far without
crtcs, but looks like we can via the dumb_create ioctl.

Not sure if we shouldn't disable dumb buffer support entirely
when we have no crtcs, but that would require some amount of work
as the only thing currently being checked is dev->driver->dumb_create
which we'd have to convert to some device specific dynamic thing.

Cc: stable@vger.kernel.org
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: aa5ca8b742 ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit baea9ffe64)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit aeec766133)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-18 09:20:23 -08:00
Filipe Manana
042528f8d8 Btrfs: fix block group remaining RO forever after error during device replace
When doing a device replace, while at scrub.c:scrub_enumerate_chunks(), we
set the block group to RO mode and then wait for any ongoing writes into
extents of the block group to complete. While doing that wait we overwrite
the value of the variable 'ret' and can break out of the loop if an error
happens without turning the block group back into RW mode. So what happens
is the following:

1) btrfs_inc_block_group_ro() returns 0, meaning it set the block group
   to RO mode (its ->ro field set to 1 or incremented to some value > 1);

2) Then btrfs_wait_ordered_roots() returns a value > 0;

3) Then if either joining or committing the transaction fails, we break
   out of the loop wihtout calling btrfs_dec_block_group_ro(), leaving
   the block group in RO mode forever.

To fix this, just remove the code that waits for ongoing writes to extents
of the block group, since it's not needed because in the initial setup
phase of a device replace operation, before starting to find all chunks
and their extents, we set the target device for replace while holding
fs_info->dev_replace->rwsem, which ensures that after releasing that
semaphore, any writes into the source device are made to the target device
as well (__btrfs_map_block() guarantees that). So while at
scrub_enumerate_chunks() we only need to worry about finding and copying
extents (from the source device to the target device) that were written
before we started the device replace operation.

Fixes: f0e9b7d640 ("Btrfs: fix race setting block group readonly during device replace")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 18:07:55 +01:00
Qu Wenruo
b12de52896 btrfs: scrub: Don't check free space before marking a block group RO
[BUG]
When running btrfs/072 with only one online CPU, it has a pretty high
chance to fail:

  btrfs/072 12s ... _check_dmesg: something found in dmesg (see xfstests-dev/results//btrfs/072.dmesg)
  - output mismatch (see xfstests-dev/results//btrfs/072.out.bad)
      --- tests/btrfs/072.out     2019-10-22 15:18:14.008965340 +0800
      +++ /xfstests-dev/results//btrfs/072.out.bad      2019-11-14 15:56:45.877152240 +0800
      @@ -1,2 +1,3 @@
       QA output created by 072
       Silence is golden
      +Scrub find errors in "-m dup -d single" test
      ...

And with the following call trace:

  BTRFS info (device dm-5): scrub: started on devid 1
  ------------[ cut here ]------------
  BTRFS: Transaction aborted (error -27)
  WARNING: CPU: 0 PID: 55087 at fs/btrfs/block-group.c:1890 btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
  CPU: 0 PID: 55087 Comm: btrfs Tainted: G        W  O      5.4.0-rc1-custom+ #13
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
  Call Trace:
   __btrfs_end_transaction+0xdb/0x310 [btrfs]
   btrfs_end_transaction+0x10/0x20 [btrfs]
   btrfs_inc_block_group_ro+0x1c9/0x210 [btrfs]
   scrub_enumerate_chunks+0x264/0x940 [btrfs]
   btrfs_scrub_dev+0x45c/0x8f0 [btrfs]
   btrfs_ioctl+0x31a1/0x3fb0 [btrfs]
   do_vfs_ioctl+0x636/0xaa0
   ksys_ioctl+0x67/0x90
   __x64_sys_ioctl+0x43/0x50
   do_syscall_64+0x79/0xe0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  ---[ end trace 166c865cec7688e7 ]---

[CAUSE]
The error number -27 is -EFBIG, returned from the following call chain:
btrfs_end_transaction()
|- __btrfs_end_transaction()
   |- btrfs_create_pending_block_groups()
      |- btrfs_finish_chunk_alloc()
         |- btrfs_add_system_chunk()

This happens because we have used up all space of
btrfs_super_block::sys_chunk_array.

The root cause is, we have the following bad loop of creating tons of
system chunks:

1. The only SYSTEM chunk is being scrubbed
   It's very common to have only one SYSTEM chunk.
2. New SYSTEM bg will be allocated
   As btrfs_inc_block_group_ro() will check if we have enough space
   after marking current bg RO. If not, then allocate a new chunk.
3. New SYSTEM bg is still empty, will be reclaimed
   During the reclaim, we will mark it RO again.
4. That newly allocated empty SYSTEM bg get scrubbed
   We go back to step 2, as the bg is already mark RO but still not
   cleaned up yet.

If the cleaner kthread doesn't get executed fast enough (e.g. only one
CPU), then we will get more and more empty SYSTEM chunks, using up all
the space of btrfs_super_block::sys_chunk_array.

[FIX]
Since scrub/dev-replace doesn't always need to allocate new extent,
especially chunk tree extent, so we don't really need to do chunk
pre-allocation.

To break above spiral, here we introduce a new parameter to
btrfs_inc_block_group(), @do_chunk_alloc, which indicates whether we
need extra chunk pre-allocation.

For relocation, we pass @do_chunk_alloc=true, while for scrub, we pass
@do_chunk_alloc=false.
This should keep unnecessary empty chunks from popping up for scrub.

Also, since there are two parameters for btrfs_inc_block_group_ro(),
add more comment for it.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 18:07:55 +01:00
Johannes Thumshirn
7f0432d0d8 btrfs: change btrfs_fs_devices::rotating to bool
struct btrfs_fs_devices::rotating currently is declared as an integer
variable but only used as a boolean.

Change the variable definition to bool and update to code touching it to
set 'true' and 'false'.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:51 +01:00
Johannes Thumshirn
0395d84f8e btrfs: change btrfs_fs_devices::seeding to bool
struct btrfs_fs_devices::seeding currently is declared as an integer
variable but only used as a boolean.

Change the variable definition to bool and update to code touching it to
set 'true' and 'false'.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:51 +01:00
David Sterba
32da5386d9 btrfs: rename btrfs_block_group_cache
The type name is misleading, a single entry is named 'cache' while this
normally means a collection of objects. Rename that everywhere. Also the
identifier was quite long, making function prototypes harder to format.

Suggested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:51 +01:00
Qu Wenruo
d49a2ddb15 btrfs: block-group: Reuse the item key from caller of read_one_block_group()
For read_one_block_group(), its only caller has already got the item key
to search next block group item.

So we can use that key directly without doing our own convertion on
stack.

Also, since that key used in btrfs_read_block_groups() is vital for
block group item search, add 'const' keyword for that parameter to
prevent read_one_block_group() to modify it.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:50 +01:00
Qu Wenruo
ffb9e0f05f btrfs: block-group: Refactor btrfs_read_block_groups()
Refactor the work inside the loop of btrfs_read_block_groups() into one
separate function, read_one_block_group().

This allows read_one_block_group to be reused for later BG_TREE feature.

The refactor does the following extra fix:
- Use btrfs_fs_incompat() to replace open-coded feature check

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:50 +01:00
David Sterba
d4e253bbbc btrfs: document extent buffer locking
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:50 +01:00
David Sterba
a4477988cf btrfs: access eb::blocking_writers according to ACCESS_ONCE policies
A nice writeup of the LKMM (Linux Kernel Memory Model) rules for access
once policies can be found here
https://lwn.net/Articles/799218/#Access-Marking%20Policies .

The locked and unlocked access to eb::blocking_writers should be
annotated accordingly, following this:

Writes:

- locked write must use ONCE, may use plain read
- unlocked write must use ONCE

Reads:

- unlocked read must use ONCE
- locked read may use plain read iff not mixed with unlocked read
- unlocked read then locked must use ONCE

There's one difference on the assembly level, where
btrfs_tree_read_lock_atomic and btrfs_try_tree_read_lock used the cached
value and did not reevaluate it after taking the lock. This could have
missed some opportunities to take the lock in case blocking writers
changed between the calls, but the window is just a few instructions
long. As this is in try-lock, the callers handle that.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:50 +01:00
David Sterba
40d38f53d4 btrfs: set blocking_writers directly, no increment or decrement
The increment and decrement was inherited from previous version that
used atomics, switched in commit 06297d8cef ("btrfs: switch
extent_buffer blocking_writers from atomic to int"). The only possible
values are 0 and 1 so we can set them directly.

The generated assembly (gcc 9.x) did the direct value assignment in
btrfs_set_lock_blocking_write (asm diff after change in 06297d8cef):

     5d:   test   %eax,%eax
     5f:   je     62 <btrfs_set_lock_blocking_write+0x22>
     61:   retq

  -  62:   lock incl 0x44(%rdi)
  -  66:   add    $0x50,%rdi
  -  6a:   jmpq   6f <btrfs_set_lock_blocking_write+0x2f>

  +  62:   movl   $0x1,0x44(%rdi)
  +  69:   add    $0x50,%rdi
  +  6d:   jmpq   72 <btrfs_set_lock_blocking_write+0x32>

The part in btrfs_tree_unlock did a decrement because
BUG_ON(blockers > 1) is probably not a strong hint for the compiler, but
otherwise the output looks safe:

  - lock decl 0x44(%rdi)

  + sub    $0x1,%eax
  + mov    %eax,0x44(%rdi)

Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:50 +01:00
David Sterba
f5c2a52590 btrfs: merge blocking_writers branches in btrfs_tree_read_lock
There are two ifs that use eb::blocking_writers. As this is a variable
modified inside and outside of locks, we could minimize number of
accesses to avoid problems with getting different results at different
times.

The access here is locked so this can only race with btrfs_tree_unlock
that sets blocking_writers to 0 without lock and unsets the lock owner.

The first branch is taken only if the same thread already holds the
lock, the second if checks for blocking writers. Here we'd either unlock
and wait, or proceed. Both are valid states of the locking protocol.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:50 +01:00
David Sterba
9c907446dc btrfs: drop incompat bit for raid1c34 after last block group is gone
When there are no raid1c3 or raid1c4 block groups left after balance
(either convert or with other filters applied), remove the incompat bit.
This is already done for RAID56, do the same for RAID1C34.

Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 17:51:49 +01:00