Commit Graph

827945 Commits

Author SHA1 Message Date
Ramalingam C
5e23491175 misc/mei/hdcp: Enabling the HDCP authentication
Request to ME to configure a port as authenticated.

On Success, ME FW will mark the port as authenticated and provides
HDCP cipher with the encryption keys.

Enabling the Authentication can be requested once all stages of
HDCP2.2 authentication is completed by interacting with ME FW.

Only after this stage, driver can enable the HDCP encryption for
the port, through HW registers.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
  Style and typos fixed [Uma]
v5: Rebased.
v6:
  Collected the Rb-ed by.
  Rebased.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc addition. [Tomas]
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  K-Doc fix. [Tomas]
v11:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-14-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:49 +01:00
Ramalingam C
0a1af1b5c1 misc/mei/hdcp: Verify M_prime
Request to ME to verify the M_Prime received from the HDCP sink.

ME FW will calculate the M and compare with M_prime received
as part of RepeaterAuth_Stream_Ready, which is HDCP2.2 protocol msg.

On successful completion of this stage, downstream propagation of
the stream management info is completed.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
  endianness conversion func is moved to drm_hdcp.h [Uma]
v5: Rebased.
v6:
  Collected the Rb-ed by.
  Rebasing.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc addition. [Tomas]
  drm_hdcp2_u32_to_seq_num() is used for u32 to seq_num.
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  K-Doc fix. [Tomas]
v11:
  %s/__swab16/cpu_to_be16 [Tomas]

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-13-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:44 +01:00
Ramalingam C
f46ea842ed misc/mei/hdcp: Repeater topology verification and ack
Request ME to verify the downstream topology information received.

ME FW will validate the Repeaters receiver id list and
downstream topology.

On Success ME FW will provide the Least Significant
128bits of VPrime, which forms the repeater ack.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
  Style and typos fixed [Uma]
v5: Rebased.
v6: Rebasing.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc addition. [Tomas]
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  K-Doc fix. [Tomas]
v11:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-12-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:38 +01:00
Ramalingam C
b491264fca misc/mei/hdcp: Prepare Session Key
Request to ME to prepare the encrypted session key.

On Success, ME provides Encrypted session key. Function populates
the HDCP2.2 authentication msg SKE_Send_Eks.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
  Style fixed [Uma]
v5: Rebased.
v6:
  Collected the Rb-ed by.
  Rebasing.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc addition. [Tomas]
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  K-Doc fix. [Tomas]
v11:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-11-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:32 +01:00
Ramalingam C
45479b67be misc/mei/hdcp: Verify L_prime
Request to ME to verify the LPrime received from HDCP sink.

On Success, ME FW will verify the received Lprime by calculating and
comparing with L.

This represents the completion of Locality Check.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
  Style fixed [Uma]
v5: Rebased.
v6:
  Collected the Rb-ed by.
  Rebasing.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc addition. [Tomas]
  memcpy for const length.
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  K-Doc fix. [Tomas]
v11:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-10-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:26 +01:00
Ramalingam C
682932f3e1 misc/mei/hdcp: Initiate Locality check
Requests ME to start the second stage of HDCP2.2 authentication,
called Locality Check.

On Success, ME FW will provide LC_Init message to send to hdcp sink.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd used for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
  Style fixed [Uma]
v5: Rebased.
v6:
  Collected the Rb-ed by.
  Rebasing.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc addition. [Tomas]
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  K-Doc fix. [Tomas]
v11:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-9-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:20 +01:00
Ramalingam C
6a1a00a30e misc/mei/hdcp: Store the HDCP Pairing info
Provides Pairing info to ME to store.

Pairing is a process to fast track the subsequent authentication
with the same HDCP sink.

On Success, received HDCP pairing info is stored in non-volatile
memory of ME.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
  Style fixed [Uma]
v5: Rebased.
v6:
  Collected the Rb-ed by.
  Rebasing.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc addition. [Tomas]
  memcpy for const length.
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  K-Doc fix. [Tomas]
v11:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-8-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:14 +01:00
Ramalingam C
a7dcbed2bb misc/mei/hdcp: Verify H_prime
Requests for the verification of AKE_Send_H_prime.

ME will calculate the H and comparing it with received H_Prime.
The result will be returned as status.

Here AKE_Send_H_prime is a HDCP2.2 Authentication msg.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
  Styles and typos fixed [Uma]
v5: Rebased.
v6:
  Collected the Rb-ed by.
  Rebasing.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc Addition [Tomas]
  memcpy for const length.
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  K-Doc fix. [Tomas]
v11:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-7-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:09 +01:00
Ramalingam C
39b71c2baa misc/mei/hdcp: Verify Receiver Cert and prepare km
Requests for verification for receiver certification and also the
preparation for next AKE auth message with km.

On Success ME FW validate the HDCP2.2 receivers certificate and do the
revocation check on the receiver ID. AKE_Stored_Km will be prepared if
the receiver is already paired, else AKE_No_Stored_Km will be prepared.

Here AKE_Stored_Km and AKE_No_Stored_Km are HDCP2.2 protocol msgs.

v2: Rebased.
v3:
  cldev is passed as first parameter [Tomas]
  Redundant comments and cast are removed [Tomas]
v4:
  %zd is used for ssize_t [Alexander]
  %s/return -1/return -EIO [Alexander]
v5: Rebased.
v6:
  Collected the Rb-ed by.
  Rebasing.
v7:
  Adjust to the new mei interface.
  Fix for Kdoc.
v8:
  K-Doc Addition. [Tomas]
  memcpy for const length.
v9:
  renamed func as mei_hdcp_* [Tomas]
  Inline function is defined for DDI index [Tomas]
v10:
  Fixed the conversion of u8 to bool [Tomas]
  K-Doc fix [Tomas]
v11:
  Rebased.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-6-git-send-email-ramalingam.c@intel.com
2019-02-25 17:02:01 +01:00
Ramalingam C
a37fb1e473 misc/mei/hdcp: Initiate Wired HDCP2.2 Tx Session
Request ME FW to start the HDCP2.2 session for an intel port.
Prepares payloads for command WIRED_INITIATE_HDCP2_SESSION and sends
to ME FW.

On Success, ME FW will start a HDCP2.2 session for the port and
provides the content for HDCP2.2 AKE_Init message.

v2: Rebased.
v3:
  cldev is add as a separate parameter [Tomas]
  Redundant comment and typecast are removed [Tomas]
v4:
  %zd is used for size [Alexander]
  %s/return -1/return -EIO [Alexander]
  Spellings in commit msg is fixed [Uma]
v5: Rebased.
v6:
  Collected the rb-ed by.
  Realigning the patches in the series.
v7:
  Adjust to the new mei interface.
  Fix for kdoc.
v8:
  K-Doc Addition.
  memcpy for const length.
v9:
  s/mei_hdcp_ddi/mei_fw_ddi
  s/i915_port/mei_i915_port [Tomas]
  renamed func as mei_hdcp_* [Tomas]
  Instead of macro, inline func for ddi index is used. [Tomas]
v10:
  Switch case for the coversion between i915_port to mei_ddi [Tomas]
  Kernel doc fix.
v11:
  mei_hdcp_ops is defined as const. [Tomas]

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-5-git-send-email-ramalingam.c@intel.com
2019-02-25 17:01:43 +01:00
Ramalingam C
cf8ecce202 misc/mei/hdcp: Define ME FW interface for HDCP2.2
Defines the HDCP specific ME FW interfaces such as Request CMDs,
payload structure for CMDs and their response status codes.

This patch defines payload size(Excluding the Header)for each WIRED
HDCP2.2 CMDs.

v2: Rebased.
v3:
  Extra comments are removed.
v4:
  %s/\/\*\*/\/\*
v5:
  Extra lines are removed.
v6:
  Remove redundant text from the License header
  %s/LPRIME_HALF/V_PRIME_HALF
  %s/uintxx_t/uxx
v7:
  Extra taps removed.
v8:
  k is defined as __be16 [Tomas]

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Acked-by Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-4-git-send-email-ramalingam.c@intel.com
2019-02-25 17:01:36 +01:00
Ramalingam C
64e9bbdd95 misc/mei/hdcp: Client driver for HDCP application
ME FW contributes a vital role in HDCP2.2 authentication.
HDCP2.2 driver needs to communicate to ME FW for each step of the
HDCP2.2 authentication.

ME FW prepare and HDCP2.2 authentication  parameters and encrypt them
as per spec. With such parameter Driver prepares HDCP2.2 auth messages
and communicate with HDCP2.2 sink.

Similarly HDCP2.2 sink's response is shared with ME FW for decrypt and
verification.

Once All the steps of HDCP2.2 authentications are complete on driver's
request ME FW will configure the port as authenticated and supply the
HDCP keys to the Gen HW for encryption.

Only after this stage HDCP2.2 driver can start the HDCP2.2 encryption
for a port.

ME FW is interfaced to kernel through MEI Bus Driver. To obtain the
HDCP2.2 services from the ME FW through MEI Bus driver MEI Client
Driver is developed.

v2:
  hdcp files are moved to drivers/misc/mei/hdcp/ [Tomas]
v3:
  Squashed the Kbuild support [Tomas]
  UUID renamed and Module License is modified [Tomas]
  drv_data is set to null at remove [Tomas]
v4:
  Module name is changed to "MEI HDCP"
  I915 Selects the MEI_HDCP
v5:
  Remove redundant text from the License header
  Fix malformed licence
  Removed the drv_data resetting.
v6:
  K-Doc addition. [Tomas]
v7:
  %s/UUID_LE/GUID_INIT [Tomas]
  GPL Ver is 2.0 than 2.0+ [Tomas]
v8:
  Added more info into Kconfig addition [Tomas]

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-3-git-send-email-ramalingam.c@intel.com
2019-02-25 17:01:25 +01:00
Tomas Winkler
d1e204e8ca mei: bus: whitelist hdcp client
Whitelist HDCP client for in kernel drm use

v2:
  Rebased.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550772730-23280-2-git-send-email-ramalingam.c@intel.com
2019-02-25 17:01:11 +01:00
Takashi Iwai
cfc35f9c12 ALSA: hda: Extend i915 component bind timeout
I set 10 seconds for the timeout of the i915 audio component binding
with a hope that recent machines are fast enough to handle all probe
tasks in that period, but I was too optimistic.  The binding may take
longer than that, and this caused a problem on the machine with both
audio and graphics driver modules loaded in parallel, as Paul Menzel
experienced.  This problem haven't hit so often just because the KMS
driver is loaded in initrd on most machines.

As a simple workaround, extend the timeout to 60 seconds.

Fixes: f9b54e1961 ("ALSA: hda/i915: Allow delayed i915 audio component binding")
Reported-by: Paul Menzel <pmenzel+alsa-devel@molgen.mpg.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-02-25 16:59:02 +01:00
Juerg Haefliger
0e27ded115 selftests/ftrace: Handle the absence of tput
In environments where tput is not available, we get the following
error
$ ./ftracetest: 163: [: Illegal number:
because ncolors is an empty string. Fix that by setting it to 0 if the
tput command fails.

Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
2019-02-25 07:48:01 -07:00
Jonathan Neuschäfer
c9bd505dbd mmc: spi: Fix card detection during probe
When using the mmc_spi driver with a card-detect pin, I noticed that the
card was not detected immediately after probe, but only after it was
unplugged and plugged back in (and the CD IRQ fired).

The call tree looks something like this:

mmc_spi_probe
  mmc_add_host
    mmc_start_host
      _mmc_detect_change
        mmc_schedule_delayed_work(&host->detect, 0)
          mmc_rescan
            host->bus_ops->detect(host)
              mmc_detect
                _mmc_detect_card_removed
                  host->ops->get_cd(host)
                    mmc_gpio_get_cd -> -ENOSYS (ctx->cd_gpio not set)
  mmc_gpiod_request_cd
    ctx->cd_gpio = desc

To fix this issue, call mmc_detect_change after the card-detect GPIO/IRQ
is registered.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-25 15:40:36 +01:00
Trond Myklebust
06b5fc3ad9 Merge tag 'nfs-rdma-for-5.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
NFSoRDMA client updates for 5.1

New features:
- Convert rpc auth layer to use xdr_streams
- Config option to disable insecure enctypes
- Reduce size of RPC receive buffers

Bugfixes and cleanups:
- Fix sparse warnings
- Check inline size before providing a write chunk
- Reduce the receive doorbell rate
- Various tracepoint improvements

[Trond: Fix up merge conflicts]
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-25 09:35:49 -05:00
Linus Walleij
badf14ceba hwmon: (ad741x) Add DT bindings for Analog Devices AD741x
This adds device tree bindings for Analog Devices AD741x
as found in Gateway routers.

Cc: devicetree@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-02-25 06:35:42 -08:00
Ulf Hansson
9d2d24302e mmc: core: Move mmc_of_parse_voltage() to host.c
MMC OF parsing functions, which parses various host DT properties, should
stay close to each other. Therefore, let's move mmc_of_parse_voltage()
close to mmc_of_parse() into host.c.

Additionally, there is no reason to build the code only when CONFIG_OF is
set, as there should be stub functions for the OF helpers that is being
used, so let's drop this condition as well.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-25 15:20:58 +01:00
Ulf Hansson
3958790e67 mmc: core: Convert mmc_regulator_get_ocrmask() to static
The only left user of mmc_regulator_get_ocrmask() is the mmc core itself.
Therefore, let's drop the export and turn it into static.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-25 15:20:58 +01:00
Ulf Hansson
de13d5a44e mmc: core: Move regulator helpers to separate file
The mmc regulator helper functions, are placed in the extensive core.c
file.  In a step towards trying to create a better structure of files,
avoiding too many lines of code per file, let's move these helpers to a new
file, regulator.c.

Moreover, this within this context it makes sense to also drop the export
of mmc_vddrange_to_ocrmask(), but instead let's make it internal to the mmc
core.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-25 15:20:58 +01:00
Ulf Hansson
643108630e mmc: of_mmc_spi: Convert to mmc_of_parse_voltage()
Let's drop the open-coding of the parsing of the "voltage-ranges" DT
property and convert to use the common mmc_of_parse_voltage() API instead.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-25 15:20:58 +01:00
Ulf Hansson
03cd5c05d4 mmc: core: Drop retries as in-parameter to mmc_wait_for_app_cmd()
All callers of mmc_wait_for_app_cmd() set the retries in-parameter to
MMC_CMD_RETRIES. This is silly, so let's just drop the in-parameter
altogether, as to simplify the code.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-25 15:20:58 +01:00
Ulf Hansson
9a4b869b0c mmc: core: Convert mmc_wait_for_app_cmd() to static
mmc_wait_for_app_cmd() is an internal function for sd_ops.c, thus let's
drop the unnecessary export and turn it into static function.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-25 15:20:58 +01:00
Linus Walleij
c0162a49e0 gpio: amd-fch: Drop const from resource
The build servers and linux-next are complaining like this:

drivers/gpio/gpio-amd-fch.c: In function 'amd_fch_gpio_probe':
drivers/gpio/gpio-amd-fch.c:164:49: warning: passing argument 2 of
'devm_ioremap_resource' discards 'const' qualifier from pointer
target type [-Wdiscarded-qualifiers]
priv->base = devm_ioremap_resource(&pdev->dev, &amd_fch_gpio_iores);
                                               ^~~~~~~~~~~~~~~~~~~
In file included from include/linux/platform_device.h:14, from
drivers/gpio/gpio-amd-fch.c:15:
include/linux/device.h:709:15: note: expected 'struct resource *'
but argument is of type 'const struct resource *'
 void __iomem *devm_ioremap_resource(struct device *dev,struct resource *res);
               ^~~~~~~~~~~~~~~~~~~~~

Let's just remove "const" for now.

It is possible that devm_ioremap_resource() should rather
be constified so we can pass const resources as arguments.
But right now I just want to get rid of this build warning.

Fixes: e09d168f13 ("gpio: AMD G-Series PCH gpio driver")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Cc: Enrico Weigelt <info@metux.net>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-02-25 15:04:07 +01:00
Andi Kleen
94816add00 perf tools: Add perf_exe() helper to find perf binary
Also convert one existing user.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224153722.27020-9-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:58:28 -03:00
Andi Kleen
4b6ac811bc perf script: Handle missing fields with -F +..
When using -F + syntax to add a field the existing defaults are
currently all marked user_set. This can cause errors when some field is
missing in the perf.data

This patch tracks the actually user set fields separately, so that we don't
error out in this case.

Before:

  % perf record true
  % perf script -F +metric
  Samples for 'cycles:ppp' event do not have CPU attribute set. Cannot print 'cpu' field.
  %

After:

  5 perf record true
  % perf script -F +metric
              perf 28936 278636.237688:          1 cycles:ppp:  ffffffff8117da99 perf_event_exec+0x59 (/lib/modules/4.20.0-odilo/build/vmlinux)
  ...
  %

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224153722.27020-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:58:07 -03:00
Takeshi Saito
f0c8234cb9 mmc: renesas_sdhi: Change HW adjustment register according to speed mode
SCC is used for SDR104/HS200/HS400. We need to change SCC_DT2FF
according to the mode. If it is inappropriate, CRC error tends to occur.

This adds variable "tap_hs400" for HS400 mode and configures SCC_DT2FF
as needed.

Signed-off-by: Takeshi Saito <takeshi.saito.xv@renesas.com>
[wsa: rebased to upstream and updated commit message]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
Tested-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-25 14:45:59 +01:00
Jiri Olsa
eb6176709b perf data: Add perf_data__open_dir_data function
Add perf_data__open_dir_data to open files inside 'struct perf_data'
path directory:

   static int perf_data__open_dir(struct perf_data *data);

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:43:07 -03:00
Jiri Olsa
1455206311 perf data: Add perf_data__(create_dir|close_dir) functions
Add perf_data__create_dir() to create nr files inside 'struct perf_data'
path directory:

  int perf_data__create_dir(struct perf_data *data, int nr);

and function to close that data:

  void perf_data__close_dir(struct perf_data *data);

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:42:05 -03:00
Jiri Olsa
ccb7a71dce perf data: Fail check_backup in case of error
And display the error message from removing the old data file:

  $ perf record ls
  Can't remove old data: Permission denied (perf.data.old)
  Perf session creation failed.

  $ perf record ls
  Can't remove old data: Unknown file found (perf.data.old)
  Perf session creation failed.

Not sure how to make fail the rename (after we successfully remove the
destination file/dir) to show the message, anyway let's have it there.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:37:01 -03:00
Jiri Olsa
5021fc4e8c perf data: Make check_backup work over directories
Change check_backup() to call rm_rf_perf_data() instead of unlink() to
work over directory paths.

Also move the call earlier in the code, before we fork for file/dir, so
it can backup also directory data.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:35:19 -03:00
Jiri Olsa
c69e4c37b3 perf tools: Add rm_rf_perf_data function
To remove perf.data including the directory, with checking on expected
files and no other directories inside.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Suggested-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:33:51 -03:00
Jiri Olsa
cdb6b0235f perf tools: Add pattern name checking to rm_rf
Add pattern argument to rm_rf_depth() (and rename it to rm_rf_depth_pat())
to specify the name pattern files need to match inside the directory.

The function fails if we find different file to remove.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:33:04 -03:00
Jiri Olsa
05a4865939 perf tools: Add depth checking to rm_rf
Adding depth argument to rm_rf (and renaming it to rm_rf_depth) to
specify the depth we will go searching for files to remove.

It will be used to specify single depth for perf.data directory removal
in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-25 10:32:11 -03:00
YueHaibing
f65e25e343 btrfs: Remove unnecessary casts in btrfs_read_root_item
There is a messy cast here:
	min_t(int, len, (int)sizeof(*item)));

min_t() should normally cast to unsigned.  It's not possible for "len"
to be negative, but if it were then we definitely wouldn't want to pass
negatives to read_extent_buffer().  Also there is an extra cast.

This patch shouldn't affect runtime, it's just a clean up.

Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:19:23 +01:00
Filipe Manana
253002f2e3 Btrfs: remove assertion when searching for a key in a node/leaf
At ctree.c:key_search(), the assertion that verifies the first key on a
child extent buffer corresponds to the key at a specific slot in the
parent has a disadvantage: we effectively hit a BUG_ON() which requires
rebooting the machine later. It also does not tell any information about
which extent buffer is affected, from which root, the expected and found
keys, etc.

However as of commit 581c176041 ("btrfs: Validate child tree block's
level and first key"), that assertion is not needed since at the time we
read an extent buffer from disk we validate that its first key matches the
key, at the respective slot, in the parent extent buffer. Therefore just
remove the assertion at key_search().

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:19:23 +01:00
Filipe Manana
cbca7d59fe Btrfs: add missing error handling after doing leaf/node binary search
The function map_private_extent_buffer() can return an -EINVAL error, and
it is called by generic_bin_search() which will return back the error. The
btrfs_bin_search() function in turn calls generic_bin_search() and the
key_search() function calls btrfs_bin_search(), so both can return the
-EINVAL error coming from the map_private_extent_buffer() function. Some
callers of these functions were ignoring that these functions can return
an error, so fix them to deal with error return values.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:19:23 +01:00
Dan Carpenter
669e859b5e btrfs: drop the lock on error in btrfs_dev_replace_cancel
We should drop the lock on this error path.  This has been found by a
static tool.

The lock needs to be released, it's there to protect access to the
dev_replace members and is not supposed to be left locked. The value of
state that's being switched would need to be artifically changed to an
invalid value so the default: branch is taken.

Fixes: d189dd70e2 ("btrfs: fix use-after-free due to race between replace start and cancel")
CC: stable@vger.kernel.org # 5.0+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:41 +01:00
Johannes Thumshirn
349ae63f40 btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
We recently had a customer issue with a corrupted filesystem. When
trying to mount this image btrfs panicked with a division by zero in
calc_stripe_length().

The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length()
takes this value and divides it by the number of copies the RAID profile
is expected to have to calculate the amount of data stripes. As a DUP
profile is expected to have 2 copies this division resulted in 1/2 = 0.
Later then the 'data_stripes' variable is used as a divisor in the
stripe length calculation which results in a division by 0 and thus a
kernel panic.

When encountering a filesystem with a DUP block group and a
'num_stripes' value unequal to 2, refuse mounting as the image is
corrupted and will lead to unexpected behaviour.

Code inspection showed a RAID1 block group has the same issues.

Fixes: e06cd3dd7c ("Btrfs: add validadtion checks for chunk loading")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:41 +01:00
Dan Robertson
e49be14b8d btrfs: init csum_list before possible free
The scrub_ctx csum_list member must be initialized before scrub_free_ctx
is called. If the csum_list is not initialized beforehand, the
list_empty call in scrub_free_csums will result in a null deref if the
allocation fails in the for loop.

Fixes: a2de733c78 ("btrfs: scrub")
CC: stable@vger.kernel.org # 3.0+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:41 +01:00
Filipe Manana
57a50e2506 Btrfs: remove no longer needed range length checks for deduplication
Comparing the content of the pages in the range to deduplicate is now
done in generic_remap_checks called by the generic helper
generic_remap_file_range_prep(), which takes care of ensuring we do not
compare/deduplicate undefined data beyond a file's EOF (range from EOF
to the next block boundary). So remove these checks which are now
redundant.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:40 +01:00
Filipe Manana
a3baaf0d78 Btrfs: fix fsync after succession of renames and unlink/rmdir
After a succession of renames operations of different files and unlinking
one of them, if we fsync one of the renamed files we can end up with a
log that will either fail to replay at mount time or result in a filesystem
that is in an inconsistent state. One example scenario:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ mkdir /mnt/testdir
  $ touch /mnt/testdir/fname1
  $ touch /mnt/testdir/fname2

  $ sync

  $ mv /mnt/testdir/fname1 /mnt/testdir/fname3
  $ rm -f /mnt/testdir/fname2
  $ ln /mnt/testdir/fname3 /mnt/testdir/fname2

  $ touch /mnt/testdir/fname1
  $ xfs_io -c "fsync" /mnt/testdir/fname1

  <power failure>

  $ mount /dev/sdb /mnt
  $ umount /mnt
  $ btrfs check /dev/sdb
  [1/7] checking root items
  [2/7] checking extents
  [3/7] checking free space cache
  [4/7] checking fs roots
  root 5 inode 259 errors 2, no orphan item
  ERROR: errors found in fs roots
  Opening filesystem to check...
  Checking filesystem on /dev/sdc
  UUID: 20e4abb8-5a19-4492-8bb4-6084125c2d0d
  found 393216 bytes used, error(s) found
  total csum bytes: 0
  total tree bytes: 131072
  total fs tree bytes: 32768
  total extent tree bytes: 16384
  btree space waste bytes: 122986
  file data blocks allocated: 262144
   referenced 262144

On a kernel without the first patch in this series, titled
"[PATCH] Btrfs: fix fsync after succession of renames of different files",
we get instead an error when mounting the filesystem due to failure of
replaying the log:

  $ mount /dev/sdb /mnt
  mount: mount /dev/sdb on /mnt failed: File exists

Fix this by logging the parent directory of an inode whenever we find an
inode that no longer exists (was unlinked in the current transaction),
during the procedure which finds inodes that have old names that collide
with new names of other inodes.

A test case for fstests follows soon.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:40 +01:00
Filipe Manana
6b5fc433a7 Btrfs: fix fsync after succession of renames of different files
After a succession of rename operations of different files and fsyncing
one of them, such that each file gets a new name that corresponds to an
old name of another file, we can end up with a log that will cause a
failure when attempted to replay at mount time (an EEXIST error).
We currently have correct behaviour when such succession of renames
involves only two files, but if there are more files involved, we end up
not logging all the inodes that are needed, therefore resulting in a
failure when attempting to replay the log.

Example:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ mkdir /mnt/testdir
  $ touch /mnt/testdir/fname1
  $ touch /mnt/testdir/fname2

  $ sync

  $ mv /mnt/testdir/fname1 /mnt/testdir/fname3
  $ mv /mnt/testdir/fname2 /mnt/testdir/fname4
  $ ln /mnt/testdir/fname3 /mnt/testdir/fname2

  $ touch /mnt/testdir/fname1
  $ xfs_io -c "fsync" /mnt/testdir/fname1

  <power failure>

  $ mount /dev/sdb /mnt
  mount: mount /dev/sdb on /mnt failed: File exists

So fix this by checking all inode dependencies when logging an inode. That
is, if one logged inode A has a new name that matches the old name of some
other inode B, check if inode B has a new name that matches the old name
of some other inode C, and so on. This fix is implemented not by doing any
recursive function calls but by using an iterative method using a linked
list that is used in a first-in-first-out fashion.

A test case for fstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:40 +01:00
Josef Bacik
38e3eebff6 btrfs: honor path->skip_locking in backref code
Qgroups will do the old roots lookup at delayed ref time, which could be
while walking down the extent root while running a delayed ref.  This
should be fine, except we specifically lock eb's in the backref walking
code irrespective of path->skip_locking, which deadlocks the system.
Fix up the backref code to honor path->skip_locking, nobody will be
modifying the commit_root when we're searching so it's completely safe
to do.

This happens since fb235dc06f ("btrfs: qgroup: Move half of the qgroup
accounting time out of commit trans"), kernel may lockup with quota
enabled.

There is one backref trace triggered by snapshot dropping along with
write operation in the source subvolume.  The example can be reliably
reproduced:

  btrfs-cleaner   D    0  4062      2 0x80000000
  Call Trace:
   schedule+0x32/0x90
   btrfs_tree_read_lock+0x93/0x130 [btrfs]
   find_parent_nodes+0x29b/0x1170 [btrfs]
   btrfs_find_all_roots_safe+0xa8/0x120 [btrfs]
   btrfs_find_all_roots+0x57/0x70 [btrfs]
   btrfs_qgroup_trace_extent_post+0x37/0x70 [btrfs]
   btrfs_qgroup_trace_leaf_items+0x10b/0x140 [btrfs]
   btrfs_qgroup_trace_subtree+0xc8/0xe0 [btrfs]
   do_walk_down+0x541/0x5e3 [btrfs]
   walk_down_tree+0xab/0xe7 [btrfs]
   btrfs_drop_snapshot+0x356/0x71a [btrfs]
   btrfs_clean_one_deleted_snapshot+0xb8/0xf0 [btrfs]
   cleaner_kthread+0x12b/0x160 [btrfs]
   kthread+0x112/0x130
   ret_from_fork+0x27/0x50

When dropping snapshots with qgroup enabled, we will trigger backref
walk.

However such backref walk at that timing is pretty dangerous, as if one
of the parent nodes get WRITE locked by other thread, we could cause a
dead lock.

For example:

           FS 260     FS 261 (Dropped)
            node A        node B
           /      \      /      \
       node C      node D      node E
      /   \         /  \        /     \
  leaf F|leaf G|leaf H|leaf I|leaf J|leaf K

The lock sequence would be:

      Thread A (cleaner)             |       Thread B (other writer)
-----------------------------------------------------------------------
write_lock(B)                        |
write_lock(D)                        |
^^^ called by walk_down_tree()       |
                                     |       write_lock(A)
                                     |       write_lock(D) << Stall
read_lock(H) << for backref walk     |
read_lock(D) << lock owner is        |
                the same thread A    |
                so read lock is OK   |
read_lock(A) << Stall                |

So thread A hold write lock D, and needs read lock A to unlock.
While thread B holds write lock A, while needs lock D to unlock.

This will cause a deadlock.

This is not only limited to snapshot dropping case.  As the backref
walk, even only happens on commit trees, is breaking the normal top-down
locking order, makes it deadlock prone.

Fixes: fb235dc06f ("btrfs: qgroup: Move half of the qgroup accounting time out of commit trans")
CC: stable@vger.kernel.org # 4.14+
Reported-and-tested-by: David Sterba <dsterba@suse.com>
Reported-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
[ rebase to latest branch and fix lock assert bug in btrfs/007 ]
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ copy logs and deadlock analysis from Qu's patch ]
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:39 +01:00
Qu Wenruo
f5fef45936 btrfs: qgroup: Make qgroup async transaction commit more aggressive
[BUG]
Btrfs qgroup will still hit EDQUOT under the following case:

  $ dev=/dev/test/test
  $ mnt=/mnt/btrfs
  $ umount $mnt &> /dev/null
  $ umount $dev &> /dev/null

  $ mkfs.btrfs -f $dev
  $ mount $dev $mnt -o nospace_cache

  $ btrfs subv create $mnt/subv
  $ btrfs quota enable $mnt
  $ btrfs quota rescan -w $mnt
  $ btrfs qgroup limit -e 1G $mnt/subv

  $ fallocate -l 900M $mnt/subv/padding
  $ sync

  $ rm $mnt/subv/padding

  # Hit EDQUOT
  $ xfs_io -f -c "pwrite 0 512M" $mnt/subv/real_file

[CAUSE]
Since commit a514d63882 ("btrfs: qgroup: Commit transaction in advance
to reduce early EDQUOT"), btrfs is not forced to commit transaction to
reclaim more quota space.

Instead, we just check pertrans metadata reservation against some
threshold and try to do asynchronously transaction commit.

However in above case, the pertrans metadata reservation is pretty small
thus it will never trigger asynchronous transaction commit.

[FIX]
Instead of only accounting pertrans metadata reservation, we calculate
how much free space we have, and if there isn't much free space left,
commit transaction asynchronously to try to free some space.

This may slow down the fs when we have less than 32M free qgroup space,
but should reduce a lot of false EDQUOT, so the cost should be
acceptable.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:39 +01:00
Qu Wenruo
1418bae1c2 btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record
[BUG]
Btrfs/139 will fail with a high probability if the testing machine (VM)
has only 2G RAM.

Resulting the final write success while it should fail due to EDQUOT,
and the fs will have quota exceeding the limit by 16K.

The simplified reproducer will be: (needs a 2G ram VM)

  $ mkfs.btrfs -f $dev
  $ mount $dev $mnt

  $ btrfs subv create $mnt/subv
  $ btrfs quota enable $mnt
  $ btrfs quota rescan -w $mnt
  $ btrfs qgroup limit -e 1G $mnt/subv

  $ for i in $(seq -w  1 8); do
  	xfs_io -f -c "pwrite 0 128M" $mnt/subv/file_$i > /dev/null
  	echo "file $i written" > /dev/kmsg
    done
  $ sync
  $ btrfs qgroup show -pcre --raw $mnt

The last pwrite will not trigger EDQUOT and final 'qgroup show' will
show something like:

  qgroupid         rfer         excl     max_rfer     max_excl parent  child
  --------         ----         ----     --------     -------- ------  -----
  0/5             16384        16384         none         none ---     ---
  0/256      1073758208   1073758208         none   1073741824 ---     ---

And 1073758208 is larger than
  > 1073741824.

[CAUSE]
It's a bug in btrfs qgroup data reserved space management.

For quota limit, we must ensure that:
  reserved (data + metadata) + rfer/excl <= limit

Since rfer/excl is only updated at transaction commmit time, reserved
space needs to be taken special care.

One important part of reserved space is data, and for a new data extent
written to disk, we still need to take the reserved space until
rfer/excl numbers get updated.

Originally when an ordered extent finishes, we migrate the reserved
qgroup data space from extent_io tree to delayed ref head of the data
extent, expecting delayed ref will only be cleaned up at commit
transaction time.

However for small RAM machine, due to memory pressure dirty pages can be
flushed back to disk without committing a transaction.

The related events will be something like:

  file 1 written
  btrfs_finish_ordered_io: ino=258 ordered offset=0 len=54947840
  btrfs_finish_ordered_io: ino=258 ordered offset=54947840 len=5636096
  btrfs_finish_ordered_io: ino=258 ordered offset=61153280 len=57344
  btrfs_finish_ordered_io: ino=258 ordered offset=61210624 len=8192
  btrfs_finish_ordered_io: ino=258 ordered offset=60583936 len=569344
  cleanup_ref_head: num_bytes=54947840
  cleanup_ref_head: num_bytes=5636096
  cleanup_ref_head: num_bytes=569344
  cleanup_ref_head: num_bytes=57344
  cleanup_ref_head: num_bytes=8192
  ^^^^^^^^^^^^^^^^ This will free qgroup data reserved space
  file 2 written
  ...
  file 8 written
  cleanup_ref_head: num_bytes=8192
  ...
  btrfs_commit_transaction  <<< the only transaction committed during
				the test

When file 2 is written, we have already freed 128M reserved qgroup data
space for ino 258. Thus later write won't trigger EDQUOT.

This allows us to write more data beyond qgroup limit.

In my 2G ram VM, it could reach about 1.2G before hitting EDQUOT.

[FIX]
By moving reserved qgroup data space from btrfs_delayed_ref_head to
btrfs_qgroup_extent_record, we can ensure that reserved qgroup data
space won't be freed half way before commit transaction, thus fix the
problem.

Fixes: f64d5ca868 ("btrfs: delayed_ref: Add new function to record reserved space into delayed ref")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:39 +01:00
David Sterba
0ea8207626 btrfs: scrub: remove unused nocow worker pointer
The member btrfs_fs_info::scrub_nocow_workers is unused since the nocow
optimization was removed from scrub in 9bebe665c3 ("btrfs: scrub:
Remove unused copy_nocow_pages and its callchain").

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:38 +01:00
David Sterba
c835294274 btrfs: scrub: add assertions for worker pointers
The scrub worker pointers are not NULL iff the scrub is running, so
reset them back once the last reference is dropped. Add assertions to
the initial phase of scrub to verify that.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:38 +01:00
Anand Jain
ff09c4ca59 btrfs: scrub: convert scrub_workers_refcnt to refcount_t
Use the refcount_t for fs_info::scrub_workers_refcnt instead of int so
we get the extra checks. All reference changes are still done under
scrub_lock.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-25 14:13:38 +01:00