Commit Graph

886869 Commits

Author SHA1 Message Date
Masami Hiramatsu
5b06eeae52 selftests: breakpoints: Fix a typo of function name
Since commit 5821ba9695 ("selftests: Add test plan API to kselftest.h
and adjust callers") accidentally introduced 'a' typo in the front of
run_test() function, breakpoint_test_arm64.c became not able to be
compiled.

Remove the 'a' from arun_test().

Fixes: 5821ba9695 ("selftests: Add test plan API to kselftest.h and adjust callers")
Reported-by: Jun Takahashi <takahashi.jun_s@aa.socionext.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2019-11-07 14:27:26 -07:00
Taniya Das
eee28109f8 clk: qcom: clk-rpmh: Add support for RPMHCC for SC7180
Add support for clock RPMh driver to vote for ARC and VRM managed
clock resources.

Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lkml.kernel.org/r/1572371299-16774-4-git-send-email-tdas@codeaurora.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:20:37 -08:00
Taniya Das
36b355c840 dt-bindings: clock: Introduce RPMHCC bindings for SC7180
Add compatible for SC7180 RPMHCC.

Signed-off-by: Taniya Das <tdas@codeaurora.org>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lkml.kernel.org/r/1572371299-16774-3-git-send-email-tdas@codeaurora.org
[sboyd@kernel.org: Sort compatible list]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:20:37 -08:00
Taniya Das
681a6ad5c0 dt-bindings: clock: Add YAML schemas for the QCOM RPMHCC clock bindings
The RPMHCC clock provider have a bunch of generic properties that
are needed in a device tree. Add a YAML schemas for those.

Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lkml.kernel.org/r/1572371299-16774-2-git-send-email-tdas@codeaurora.org
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:16:37 -08:00
Taniya Das
17269568f7 clk: qcom: Add Global Clock controller (GCC) driver for SC7180
Add support for the global clock controller found on SC7180
based devices. This should allow most non-multimedia device
drivers to probe and control their clocks.

Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lkml.kernel.org/r/20191014102308.27441-6-tdas@codeaurora.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:16:01 -08:00
Taniya Das
8b9e0562f3 dt-bindings: clock: Add sc7180 GCC clock binding
Add device tree bindings for global clock subsystem clock
controller for Qualcomm Technology Inc's SC7180 SoCs.

Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lkml.kernel.org/r/20191014102308.27441-5-tdas@codeaurora.org
Reviewed-by: Rob Herring <robh@kernel.org>
[sboyd@kernel.org: Reword subject to make sc7180 specific, sort
compatible]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:15:35 -08:00
Taniya Das
9de7269e97 dt-bindings: clock: Add YAML schemas for the QCOM GCC clock bindings
The GCC clock provider have a bunch of generic properties that
are needed in a device tree. Add a YAML schemas for those.

Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lkml.kernel.org/r/20191014102308.27441-4-tdas@codeaurora.org
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:10:44 -08:00
Taniya Das
ffe37ede0a clk: qcom: common: Return NULL from clk_hw OF provider
Return NULL in the cases where the clk_hw is not registered with the
clock provider, but the clock consumer still requests for a clock id.

Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lkml.kernel.org/r/20191014102308.27441-3-tdas@codeaurora.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:10:44 -08:00
Taniya Das
1a1c78217a clk: qcom: rcg: update the DFS macro for RCG
Update the init data name for each of the dynamic frequency switch
controlled clock associated with the RCG clock name, so that it can be
generated as per the hardware plan. Thus update the macro accordingly.

Signed-off-by: Taniya Das <tdas@codeaurora.org>
Link: https://lkml.kernel.org/r/20191014102308.27441-2-tdas@codeaurora.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:10:44 -08:00
YueHaibing
57b2364d0e clk: qcom: remove unneeded semicolon
remove unneeded semicolon.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lkml.kernel.org/r/20191025093332.27592-1-yuehaibing@huawei.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:10:44 -08:00
Govind Singh
6cdef2738d clk: qcom: Add Q6SSTOP clock controller for QCS404
Add support for the Q6SSTOP clock control used on qcs404
based devices. This would allow wcss remoteproc driver to
control the required WCSS Q6SSTOP clock/reset controls to
bring the subsystem out of reset and shutdown the WCSS Q6DSP.

Signed-off-by: Govind Singh <govinds@codeaurora.org>
Link: https://lkml.kernel.org/r/20191011132928.9388-3-govinds@codeaurora.org
[sboyd@kernel.org: Sort makefile]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-11-07 13:10:36 -08:00
Darrick J. Wong
d6abecb825 xfs: range check ri_cnt when recovering log items
Range check the region counter when we're reassembling regions from log
items during log recovery.  In the old days ASSERT would halt the
kernel, but this isn't true any more so we have to make an explicit
error return.

Coverity-id: 1132508
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-07 13:00:54 -08:00
Darrick J. Wong
120254608f xfs: "optimize" buffer item log segment bitmap setting
Optimize the setting of full words of bits in xfs_buf_item_log_segment.
The optimization is purely within the bug triage process.  No functional
changes.

Coverity-id: 1446793
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-07 13:00:54 -08:00
Darrick J. Wong
f5be08446e xfs: null out bma->prev if no previous extent
Coverity complains that we don't check the return value of
xfs_iext_peek_prev_extent like we do nearly all of the time.  If there
is no previous extent then just null out bma->prev like we do elsewhere
in the bmap code.

Coverity-id: 1424057
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-07 13:00:54 -08:00
Darrick J. Wong
5f213ddbcb xfs: fix missing header includes
Some of the xfs source files are missing header includes, so add them
back.  Sparse complains about non-static functions that don't have a
forward declaration anywhere.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-07 13:00:53 -08:00
Darrick J. Wong
5d1116d4c6 xfs: periodically yield scrub threads to the scheduler
Christoph Hellwig complained about the following soft lockup warning
when running scrub after generic/175 when preemption is disabled and
slub debugging is enabled:

watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [xfs_scrub:161]
Modules linked in:
irq event stamp: 41692326
hardirqs last  enabled at (41692325): [<ffffffff8232c3b7>] _raw_0
hardirqs last disabled at (41692326): [<ffffffff81001c5a>] trace0
softirqs last  enabled at (41684994): [<ffffffff8260031f>] __do_e
softirqs last disabled at (41684987): [<ffffffff81127d8c>] irq_e0
CPU: 3 PID: 16189 Comm: xfs_scrub Not tainted 5.4.0-rc3+ #30
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.124
RIP: 0010:_raw_spin_unlock_irqrestore+0x39/0x40
Code: 89 f3 be 01 00 00 00 e8 d5 3a e5 fe 48 89 ef e8 ed 87 e5 f2
RSP: 0018:ffffc9000233f970 EFLAGS: 00000286 ORIG_RAX: ffffffffff3
RAX: ffff88813b398040 RBX: 0000000000000286 RCX: 0000000000000006
RDX: 0000000000000006 RSI: ffff88813b3988c0 RDI: ffff88813b398040
RBP: ffff888137958640 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00042b0c00
R13: 0000000000000001 R14: ffff88810ac32308 R15: ffff8881376fc040
FS:  00007f6113dea700(0000) GS:ffff88813bb80000(0000) knlGS:00000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6113de8ff8 CR3: 000000012f290000 CR4: 00000000000006e0
Call Trace:
 free_debug_processing+0x1dd/0x240
 __slab_free+0x231/0x410
 kmem_cache_free+0x30e/0x360
 xchk_ag_btcur_free+0x76/0xb0
 xchk_ag_free+0x10/0x80
 xchk_bmap_iextent_xref.isra.14+0xd9/0x120
 xchk_bmap_iextent+0x187/0x210
 xchk_bmap+0x2e0/0x3b0
 xfs_scrub_metadata+0x2e7/0x500
 xfs_ioc_scrub_metadata+0x4a/0xa0
 xfs_file_ioctl+0x58a/0xcd0
 do_vfs_ioctl+0xa0/0x6f0
 ksys_ioctl+0x5b/0x90
 __x64_sys_ioctl+0x11/0x20
 do_syscall_64+0x4b/0x1a0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

If preemption is disabled, all metadata buffers needed to perform the
scrub are already in memory, and there are a lot of records to check,
it's possible that the scrub thread will run for an extended period of
time without sleeping for IO or any other reason.  Then the watchdog
timer or the RCU stall timeout can trigger, producing the backtrace
above.

To fix this problem, call cond_resched() from the scrub thread so that
we back out to the scheduler whenever necessary.

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-07 13:00:53 -08:00
Miles Chen
88288ed050 docs: printk-formats: add ptrdiff_t type to printk-formats
When print the difference between two pointers, we should use
the ptrdiff_t modifier %t.

Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:26 -07:00
Mike Leach
f0ae2cfae5 coresight: etm4x: docs: Adds detailed document for programming etm4x.
Add in detailed programmers reference for users wanting to program the
CoreSight ETM 4.x driver using sysfs.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:26 -07:00
Mike Leach
8adf42e293 coresight: docs: Create common sub-directory for coresight trace.
There are two files in the Documentation/trace directory relating to
coresight, with more to follow, so create a Documentation/trace/coresight
directory and move existing files there. Fixup index to reference
new location.

Update MAINTAINERS to reference this sub-directory rather than the
individual files.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:26 -07:00
Mike Leach
b3ef0df181 coresight: etm4x: docs: Update ABI doc for new sysfs etm4 attributes
This updates the ABI document to reflect recent additions to the ETM4.X
driver sysfs interface.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
[Updated Date and KernelVersion fields]
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:25 -07:00
Mike Leach
5c8fac10c8 coresight: etm4x: docs: Update ABI doc for new sysfs name scheme.
Recent updates to CoreSight drivers have changed the component naming
schema used in sysfs.

This updates the ABI document to reflect the new naming schema.

Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:25 -07:00
Jeff Layton
ff467342d3 Documentation: atomic_open called with shared lock on non-O_CREAT open
The exclusive lock is only held when O_CREAT is set.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:25 -07:00
Konstantin Ryabitsev
e8686a40a3 docs: process: Add base-commit trailer usage
One of the recurring complaints from both maintainers and CI system
operators is that performing git-am on received patches is difficult
without knowing the parent object in the git history on which the
patches are based. Without this information, there is a high likelihood
that git-am will fail due to conflicts, which is particularly
frustrating to CI operators.

Git versions starting with v2.9.0 are able to automatically include
base-commit information using the --base flag of git-format-patch.
Document this usage in process/submitting-patches, and add the rationale
for its inclusion, plus instructions for those not using git on where
the "base-commit:" trailer should go.

Signed-off-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:25 -07:00
Changbin Du
36bc683dde kernel-doc: rename the kernel-doc directive 'functions' to 'identifiers'
The 'functions' directive is not only for functions, but also works for
structs/unions. So the name is misleading. This patch renames it to
'identifiers', which specific the functions/types to be included in
documentation. We keep the old name as an alias of the new one before
all documentation are updated.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:25 -07:00
Masanari Iida
e80d89380c docs: admin-guide: Remove threads-max auto-tuning
Since following path was merged in 5.4-rc3,
auto-tuning feature in threads-max does not exist any more.
Fix the admin-guide document as is.

kernel/sysctl.c: do not override max_threads provided by userspace
b0f53dbc4b

Fixes: b0f53dbc4b ("kernel/sysctl.c: do not override max_threads provided by userspace")
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:25 -07:00
Masanari Iida
73eb802ad9 docs: admin-guide: Fix min value of threads-max in kernel.rst
Since following patch was merged 5.4-rc3, minimum value for
threads-max changed to 1.

kernel/sysctl.c: do not override max_threads provided by userspace
b0f53dbc4b

Fixes: b0f53dbc4b ("kernel/sysctl.c: do not override max_threads provided by userspace")
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:24 -07:00
Louis Taylor
0d0da9aa03 scripts/sphinx-pre-install: fix Arch latexmk dependency
On Arch Linux, latexmk is installed in the texlive-core package.

Signed-off-by: Louis Taylor <louis@kragniz.eu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:24 -07:00
Louis Taylor
67dd7d87d4 docs: driver-api: make interconnect title quieter
This makes it consistent with the other headings in the Linux driver
implementer's API guide.

Signed-off-by: Louis Taylor <louis@kragniz.eu>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:24 -07:00
Jonathan Neuschäfer
43756e347f scripts/kernel-doc: Add support for named variable macro arguments
Currently, when kernel-doc encounters a macro with a named variable
argument[1], such as this:

   #define hlist_for_each_entry_rcu(pos, head, member, cond...)

... it expects the variable argument to be documented as `cond...`,
rather than `cond`. This is semantically wrong, because the name (as
used in the macro body) is actually `cond`.

With this patch, kernel-doc will accept the name without dots (`cond`
in the example above) in doc comments, and warn if the name with dots
(`cond...`) is used and verbose mode[2] is enabled.

The support for the `cond...` syntax can be removed later, when the
documentation of all such macros has been switched to the new syntax.

Testing this patch on top of v5.4-rc6, `make htmldocs` shows a few
changes in log output and HTML output:

 1) The following warnings[3] are eliminated:

   ./include/linux/rculist.h:374: warning:
        Excess function parameter 'cond' description in 'list_for_each_entry_rcu'
   ./include/linux/rculist.h:651: warning:
        Excess function parameter 'cond' description in 'hlist_for_each_entry_rcu'

 2) For list_for_each_entry_rcu and hlist_for_each_entry_rcu, the
    correct description is shown

 3) Named variable arguments are shown without dots

[1]: https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html
[2]: scripts/kernel-doc -v
[3]: See also https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?h=dev&id=5bc4bc0d6153617eabde275285b7b5a8137fdf3c

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-11-07 13:17:24 -07:00
Andreas Gruenbacher
184b4e6085 gfs2: Fix end-of-file handling in gfs2_page_mkwrite
When the filesystem block size is smaller than the page size, the last
page may contain blocks that lie entirely beyond the end of the file.
Make sure to only allocate blocks that lie at least partially in the
file.  Allocating blocks beyond that isn't useful, and what's more, they
will not be zeroed out and may end up containing random data.

With that change in place, make sure we'll still always unstuff stuffed
inodes: iomap_writepage and iomap_writepages currently can't handle
stuffed files.

In addition, simplify and move the end-of-file check further to the top
in gfs2_page_mkwrite to avoid weird side effects like unstuffing when
we're not.

Fixes xfstest generic/263.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-11-07 21:02:35 +01:00
Andreas Gruenbacher
f53056c430 gfs2: Multi-block allocations in gfs2_page_mkwrite
In gfs2_page_mkwrite's gfs2_allocate_page_backing helper, try to
allocate as many blocks at once as we need.  Pass in the size of the
requested allocation.

Fixes: 35af80aef9 ("gfs2: don't use buffer_heads in gfs2_allocate_page_backing")
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-11-07 21:02:02 +01:00
Andreas Gruenbacher
39c3a948ec gfs2: Improve mmap write vs. punch_hole consistency
When punching a hole in a file, use filemap_write_and_wait_range to
write back any dirty pages in the range of the hole.  As a side effect,
if the hole isn't page aligned, this marks unaligned pages at the
beginning and the end of the hole read-only.  This is required when the
block size is smaller than the page size: when those pages are written
to again after the hole punching, we must make sure that page_mkwrite is
called for those pages so that the page will be fully allocated and any
blocks turned into holes from the hole punching will be reallocated.
(If a page is writably mapped, page_mkwrite won't be called.)

Fixes xfstest generic/567.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-11-07 21:01:04 +01:00
Linus Torvalds
847120f859 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
 "Two fixes for the HID subsystem:

   - regression fix for i2c-hid power management (Hans de Goede)

   - signed vs unsigned API fix for Wacom driver (Jason Gerecke)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: wacom: generic: Treat serial number and related fields as unsigned
  HID: i2c-hid: Send power-on command after reset
2019-11-07 11:54:54 -08:00
Tejun Heo
1d156646e0 blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT
blkg_rwstat is now only used by bfq-iosched and blk-throtl when on
cgroup1.  Let's move it into its own files and gate it behind a config
option.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 12:28:13 -07:00
Tejun Heo
f733164829 blk-cgroup: reimplement basic IO stats using cgroup rstat
blk-cgroup has been using blkg_rwstat to track basic IO stats.
Unfortunately, reading recursive stats scales badly as itinvolves
walking all descendants.  On systems with a huge number of cgroups
(dead or alive), this can lead to substantial CPU cost when reading IO
stats.

This patch reimplements basic IO stats using cgroup rstat which uses
more memory but makes recursive stat reading O(# descendants which
have been active since last reading) instead of O(# descendants).

* blk-cgroup core no longer uses sync/async stats.  Introduce new stat
  enums - BLKG_IOSTAT_{READ|WRITE|DISCARD}.

* Add blkg_iostat[_set] which encapsulates byte and io stats, last
  values for propagation delta calculation and u64_stats_sync for
  correctness on 32bit archs.

* Update the new percpu stat counters directly and implement
  blkcg_rstat_flush() to implement propagation.

* blkg_print_stat() can now bring the stats up to date by calling
  cgroup_rstat_flush() and print them instead of directly summing up
  all descendants.

* It now allocates 96 bytes per cpu.  It used to be 40 bytes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Dan Schatzberg <dschatzberg@fb.com>
Cc: Daniel Xu <dlxu@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 12:28:13 -07:00
Tejun Heo
8a80d5d663 blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive()
These don't have users anymore.  Remove them.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 12:28:13 -07:00
Tejun Heo
7ca464383a blk-throtl: stop using blkg->stat_bytes and ->stat_ios
When used on cgroup1, blk-throtl uses the blkg->stat_bytes and
->stat_ios from blk-cgroup core to populate four stat knobs.
blk-cgroup core is moving away from blkg_rwstat to improve scalability
and won't be able to support this usage.

It isn't like the sharing gains all that much.  Let's break them out
to dedicated rwstat counters which are updated when on cgroup1.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 12:28:13 -07:00
Tejun Heo
fd41e60331 bfq-iosched: stop using blkg->stat_bytes and ->stat_ios
When used on cgroup1, bfq uses the blkg->stat_bytes and ->stat_ios
from blk-cgroup core to populate six stat knobs.  blk-cgroup core is
moving away from blkg_rwstat to improve scalability and won't be able
to support this usage.

It isn't like the sharing gains all that much.  Let's break it out to
dedicated rwstat counters which are updated when on cgroup1.  This
makes use of bfqg_*rwstat*() helpers outside of
CONFIG_BFQ_CGROUP_DEBUG.  Move them out.

v2: Compile fix when !CONFIG_BFQ_CGROUP_DEBUG.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 12:28:13 -07:00
Tejun Heo
a557f1c7fe bfq-iosched: relocate bfqg_*rwstat*() helpers
Collect them right under #ifdef CONFIG_BFQ_CGROUP_DEBUG.  The next
patch will use them from !DEBUG path and this makes it easy to move
them out of the ifdef block.

This is pure code reorganization.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 12:28:13 -07:00
Jens Axboe
912c0a8591 Merge branch 'for-linus' into for-5.5/block
Pull on for-linus to resolve what otherwise would have been a conflict
with the cgroups rstat patchset from Tejun.

* for-linus: (942 commits)
  blkcg: make blkcg_print_stat() print stats only for online blkgs
  nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
  nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
  nvme-rdma: fix a segmentation fault during module unload
  iocost: don't nest spin_lock_irq in ioc_weight_write()
  io_uring: ensure we clear io_kiocb->result before each issue
  um-ubd: Entrust re-queue to the upper layers
  nvme-multipath: remove unused groups_only mode in ana log
  nvme-multipath: fix possible io hang after ctrl reconnect
  io_uring: don't touch ctx in setup after ring fd install
  io_uring: Fix leaked shadow_req
  Linux 5.4-rc5
  riscv: cleanup do_trap_break
  nbd: verify socket is supported during setup
  ata: libahci_platform: Fix regulator_get_optional() misuse
  nbd: handle racing with error'ed out commands
  nbd: protect cmd->status with cmd->lock
  io_uring: fix bad inflight accounting for SETUP_IOPOLL|SETUP_SQTHREAD
  io_uring: used cached copies of sq->dropped and cq->overflow
  ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157
  ...
2019-11-07 12:27:19 -07:00
Chao Yu
1f0d5c911b f2fs: fix potential overflow
We expect 64-bit calculation result from below statement, however
in 32-bit machine, looped left shift operation on pgoff_t type
variable may cause overflow issue, fix it by forcing type cast.

page->index << PAGE_SHIFT;

Fixes: 26de9b1171 ("f2fs: avoid unnecessary updating inode during fsync")
Fixes: 0a2aa8fbb9 ("f2fs: refactor __exchange_data_block for speed up")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-11-07 11:17:39 -08:00
Chao Yu
2a60637f06 f2fs: fix to update dir's i_pino during cross_rename
As Eric reported:

RENAME_EXCHANGE support was just added to fsstress in xfstests:

	commit 65dfd40a97b6bbbd2a22538977bab355c5bc0f06
	Author: kaixuxia <xiakaixu1987@gmail.com>
	Date:   Thu Oct 31 14:41:48 2019 +0800

	    fsstress: add EXCHANGE renameat2 support

This is causing xfstest generic/579 to fail due to fsck.f2fs reporting errors.
I'm not sure what the problem is, but it still happens even with all the
fs-verity stuff in the test commented out, so that the test just runs fsstress.

generic/579 23s ... 	[10:02:25]
[    7.745370] run fstests generic/579 at 2019-11-04 10:02:25
_check_generic_filesystem: filesystem on /dev/vdc is inconsistent
(see /results/f2fs/results-default/generic/579.full for details)
 [10:02:47]
Ran: generic/579
Failures: generic/579
Failed 1 of 1 tests
Xunit report: /results/f2fs/results-default/result.xml

Here's the contents of 579.full:

_check_generic_filesystem: filesystem on /dev/vdc is inconsistent
*** fsck.f2fs output ***
[ASSERT] (__chk_dots_dentries:1378)  --> Bad inode number[0x24] for '..', parent parent ino is [0xd10]

The root cause is that we forgot to update directory's i_pino during
cross_rename, fix it.

Fixes: 32f9bc25cb ("f2fs: support ->rename2()")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2019-11-07 11:15:39 -08:00
Martin KaFai Lau
ed5941af3f bpf: Add cb access in kfree_skb test
Access the skb->cb[] in the kfree_skb test.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191107180905.4097871-1-kafai@fb.com
2019-11-07 10:59:08 -08:00
Martin KaFai Lau
7e3617a72d bpf: Add array support to btf_struct_access
This patch adds array support to btf_struct_access().
It supports array of int, array of struct and multidimensional
array.

It also allows using u8[] as a scratch space.  For example,
it allows access the "char cb[48]" with size larger than
the array's element "char".  Another potential use case is
"u64 icsk_ca_priv[]" in the tcp congestion control.

btf_resolve_size() is added to resolve the size of any type.
It will follow the modifier if there is any.  Please
see the function comment for details.

This patch also adds the "off < moff" check at the beginning
of the for loop.  It is to reject cases when "off" is pointing
to a "hole" in a struct.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191107180903.4097702-1-kafai@fb.com
2019-11-07 10:59:08 -08:00
Jens Axboe
5f8fd2d3e0 io_uring: properly mark async work as bounded vs unbounded
Now that io-wq supports separating the two request lifetime types, mark
the following IO as having unbounded runtimes:

- Any read/write to a non-regular file
- Any specific networked IO
- Any poll command

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-07 11:57:17 -07:00
David S. Miller
69625ea7bd Merge branch 'cxgb4-add-support-for-TC-MQPRIO-Qdisc-Offload'
Rahul Lakkireddy says:

====================
cxgb4: add support for TC-MQPRIO Qdisc Offload

This series of patches add support for offloading TC-MQPRIO Qdisc
to Chelsio T5/T6 NICs. Offloading QoS traffic shaping and pacing
requires using Ethernet Offload (ETHOFLD) resources available on
Chelsio NICs. The ETHOFLD resources are configured by firmware
and taken from the resource pool shared with other Chelsio Upper
Layer Drivers. Traffic flowing through ETHOFLD region requires a
software netdev Tx queue (EOSW_TXQ) exposed to networking stack,
and an underlying hardware Tx queue (EOHW_TXQ) used for sending
packets through hardware.

ETHOFLD region is addressed using EOTIDs, which are per-connection
resource. Hence, EOTIDs are capable of storing only a very small
number of packets in flight. To allow more connections to share
the the QoS rate limiting configuration, multiple EOTIDs must be
allocated to reduce packet drops. EOTIDs are 1-to-1 mapped with
software EOSW_TXQ. Several software EOSW_TXQs can post packets to
a single hardware EOHW_TXQ.

The series is broken down as follows:

Patch 1 queries firmware for maximum available traffic classes,
as well as, start and maximum available indices (EOTID) into ETHOFLD
region, supported by the underlying device.

Patch 2 reworks queue configuration and simplifies MSI-X allocation
logic in preparation for ETHOFLD queues support.

Patch 3 adds skeleton for validating and configuring TC-MQPRIO Qdisc
offload. Also, adds support for software EOSW_TXQs and exposes them
to network stack. Updates Tx queue selection to use fallback NIC Tx
path for unsupported traffic that can't go through ETHOFLD queues.

Patch 4 adds support for managing hardware queues to rate limit
traffic flowing through them. The queues are allocated/removed based
on enabling/disabling TC-MQPRIO Qdisc offload, respectively.

Patch 5 adds Tx path for traffic flowing through software EOSW_TXQ
and EOHW_TXQ. Also, adds Rx path to handle Tx completions.

Patch 6 updates exisiting SCHED API to configure FLOWC based QoS
offload. In the existing QUEUE based rate limiting, multiple queues
sharing a traffic class get the aggreagated max rate limit value.
On the other hand, in FLOWC based rate limiting, multiple queues
sharing a traffic class get their own individual max rate limit
value. For example, if 2 queues are bound to class 0, which is rate
limited to 1 Gbps, then in QUEUE based rate limiting, both the
queues get the aggregate max output of 1 Gbps only. In FLOWC based
rate limiting, each queue gets its own output of max 1 Gbps each;
i.e. 2 queues * 1 Gbps rate limit = 2 Gbps max output.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 10:41:59 -08:00
Rahul Lakkireddy
0e395b3cb1 cxgb4: add FLOWC based QoS offload
Rework SCHED API to allow offloading TC-MQPRIO QoS configuration.
The existing QUEUE based rate limiting throttles all queues sharing
a traffic class, to the specified max rate limit value. So, if
multiple queues share a traffic class, then all the queues get
the aggregate specified max rate limit.

So, introduce the new FLOWC based rate limiting, where multiple
queues can share a traffic class with each queue getting its own
individual specified max rate limit.

For example, if 2 queues are bound to class 0, which is rate limited
to 1 Gbps, then 2 queues using QUEUE based rate limiting, get the
aggregate output of 1 Gbps only. In FLOWC based rate limiting, each
queue gets its own output of max 1 Gbps each; i.e. 2 queues * 1 Gbps
rate limit = 2 Gbps.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 10:41:59 -08:00
Rahul Lakkireddy
4846d5330d cxgb4: add Tx and Rx path for ETHOFLD traffic
Implement Tx path for traffic flowing through software EOSW_TXQ
and EOHW_TXQ. Since multiple EOSW_TXQ can post packets to a single
EOHW_TXQ, protect the hardware queue with necessary spinlock. Also,
move common code used to generate TSO work request to a common
function.

Implement Rx path to handle Tx completions for successfully
transmitted packets.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 10:41:59 -08:00
Rahul Lakkireddy
2d0cb84dd9 cxgb4: add ETHOFLD hardware queue support
Add support for configuring and managing ETHOFLD hardware queues.
Keep the queue count and MSI-X allocation scheme same as NIC queues.
ETHOFLD hardware queues are dynamically allocated/destroyed as
TC-MQPRIO Qdisc offload is enabled/disabled on the corresponding
interface, respectively.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 10:41:59 -08:00
Rahul Lakkireddy
b1396c2bd6 cxgb4: parse and configure TC-MQPRIO offload
Add logic for validation and configuration of TC-MQPRIO Qdisc
offload. Also, add support to manage EOSW_TXQ, which have 1-to-1
mapping with EOTIDs, and expose them to network stack.

Move common skb validation in Tx path to a separate function and
add minimal Tx path for ETHOFLD. Update Tx queue selection to return
normal NIC Txq to send traffic pattern that can't go through ETHOFLD
Tx path.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 10:41:59 -08:00