The ENETC hardware support the Credit Based Shaper(CBS) which part
of the IEEE-802.1Qav. The CBS driver was loaded by the sch_cbs
interface when set in the QOS in the kernel.
Here is an example command to set 20Mbits bandwidth in 1Gbits port
for taffic class 7:
tc qdisc add dev eth0 root handle 1: mqprio \
num_tc 8 map 0 1 2 3 4 5 6 7 hw 1
tc qdisc replace dev eth0 parent 1:8 cbs \
locredit -1470 hicredit 30 \
sendslope -980000 idleslope 20000 offload 1
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add helpers to make locking/unlocking the MDIO bus easier.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven reported that using devm_reset_controller_get leads
to a WARNING when probing a reset-controlled PHY. This is because the
device devm_reset_controller_get gets supplied is not actually the
one being probed.
Acquire an unmanaged reset-control as well as free the reset_control on
unregister to fix this.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
CC: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David Bauer <mail@david-bauer.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull io_uring updates from Jens Axboe:
"A lot of stuff has been going on this cycle, with improving the
support for networked IO (and hence unbounded request completion
times) being one of the major themes. There's been a set of fixes done
this week, I'll send those out as well once we're certain we're fully
happy with them.
This contains:
- Unification of the "normal" submit path and the SQPOLL path (Pavel)
- Support for sparse (and bigger) file sets, and updating of those
file sets without needing to unregister/register again.
- Independently sized CQ ring, instead of just making it always 2x
the SQ ring size. This makes it more flexible for networked
applications.
- Support for overflowed CQ ring, never dropping events but providing
backpressure on submits.
- Add support for absolute timeouts, not just relative ones.
- Support for generic cancellations. This divorces io_uring from
workqueues as well, which additionally gets us one step closer to
generic async system call support.
- With cancellations, we can support grabbing the process file table
as well, just like we do mm context. This allows support for system
calls that create file descriptors, like accept4() support that's
built on top of that.
- Support for io_uring tracing (Dmitrii)
- Support for linked timeouts. These abort an operation if it isn't
completed by the time noted in the linke timeout.
- Speedup tracking of poll requests
- Various cleanups making the coder easier to follow (Jackie, Pavel,
Bob, YueHaibing, me)
- Update MAINTAINERS with new io_uring list"
* tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block: (64 commits)
io_uring: make POLL_ADD/POLL_REMOVE scale better
io-wq: remove now redundant struct io_wq_nulls_list
io_uring: Fix getting file for non-fd opcodes
io_uring: introduce req_need_defer()
io_uring: clean up io_uring_cancel_files()
io-wq: ensure free/busy list browsing see all items
io-wq: ensure we have a stable view of ->cur_work for cancellations
io_wq: add get/put_work handlers to io_wq_create()
io_uring: check for validity of ->rings in teardown
io_uring: fix potential deadlock in io_poll_wake()
io_uring: use correct "is IO worker" helper
io_uring: fix -ENOENT issue with linked timer with short timeout
io_uring: don't do flush cancel under inflight_lock
io_uring: flag SQPOLL busy condition to userspace
io_uring: make ASYNC_CANCEL work with poll and timeout
io_uring: provide fallback request for OOM situations
io_uring: convert accept4() -ERESTARTSYS into -EINTR
io_uring: fix error clear of ->file_table in io_sqe_files_register()
io_uring: separate the io_free_req and io_free_req_find_next interface
io_uring: keep io_put_req only responsible for release and put req
...
Pull tpmd updates from Jarkko Sakkinen:
- support for Cr50 fTPM
- support for fTPM on AMD Zen+ CPUs
- TPM 2.0 trusted keys code relocated from drivers/char/tpm to
security/keys
* tag 'tpmdd-next-20191112' of git://git.infradead.org/users/jjs/linux-tpmdd:
KEYS: trusted: Remove set but not used variable 'keyhndl'
tpm: Switch to platform_get_irq_optional()
tpm_crb: fix fTPM on AMD Zen+ CPUs
KEYS: trusted: Move TPM2 trusted keys code
KEYS: trusted: Create trusted keys subsystem
KEYS: Use common tpm_buf for trusted and asymmetric keys
tpm: Move tpm_buf code to include/linux/
tpm: use GFP_KERNEL instead of GFP_HIGHMEM for tpm_buf
tpm: add check after commands attribs tab allocation
tpm: tpm_tis_spi: Drop THIS_MODULE usage from driver struct
tpm: tpm_tis_spi: Cleanup includes
tpm: tpm_tis_spi: Support cr50 devices
tpm: tpm_tis_spi: Introduce a flow control callback
tpm: Add a flag to indicate TPM power is managed by firmware
dt-bindings: tpm: document properties for cr50
tpm_tis: override durations for STM tpm with firmware 1.2.8.28
tpm: provide a way to override the chip returned durations
tpm: Remove duplicate code from caps_show() in tpm-sysfs.c
fdget_pos() is used by file operations that will read and update f_pos:
things like "read()", "write()" and "lseek()" (but not, for example,
"pread()/pwrite" that get their file positions elsewhere).
However, it had two separate escape clauses for this, because not
everybody wants or needs serialization of the file position.
The first and most obvious case is the "file descriptor doesn't have a
position at all", ie a stream-like file. Except we didn't actually use
FMODE_STREAM, but instead used FMODE_ATOMIC_POS. The reason for that
was that FMODE_STREAM didn't exist back in the days, but also that we
didn't want to mark all the special cases, so we only marked the ones
that _required_ position atomicity according to POSIX - regular files
and directories.
The case one was intentionally lazy, but now that we _do_ have
FMODE_STREAM we could and should just use it. With the change to use
FMODE_STREAM, there are no remaining uses for FMODE_ATOMIC_POS, and all
the code to set it is deleted.
Any cases where we don't want the serialization because the driver (or
subsystem) doesn't use the file position should just be updated to do
"stream_open()". We've done that for all the obvious and common
situations, we may need a few more. Quoting Kirill Smelkov in the
original FMODE_STREAM thread (see link below for full email):
"And I appreciate if people could help at least somehow with "getting
rid of mixed case entirely" (i.e. always lock f_pos_lock on
!FMODE_STREAM), because this transition starts to diverge from my
particular use-case too far. To me it makes sense to do that
transition as follows:
- convert nonseekable_open -> stream_open via stream_open.cocci;
- audit other nonseekable_open calls and convert left users that
truly don't depend on position to stream_open;
- extend stream_open.cocci to analyze alloc_file_pseudo as well (this
will cover pipes and sockets), or maybe convert pipes and sockets
to FMODE_STREAM manually;
- extend stream_open.cocci to analyze file_operations that use
no_llseek or noop_llseek, but do not use nonseekable_open or
alloc_file_pseudo. This might find files that have stream semantic
but are opened differently;
- extend stream_open.cocci to analyze file_operations whose
.read/.write do not use ppos at all (independently of how file was
opened);
- ...
- after that remove FMODE_ATOMIC_POS and always take f_pos_lock if
!FMODE_STREAM;
- gather bug reports for deadlocked read/write and convert missed
cases to FMODE_STREAM, probably extending stream_open.cocci along
the road to catch similar cases
i.e. always take f_pos_lock unless a file is explicitly marked as
being stream, and try to find and cover all files that are streams"
We have not done the "extend stream_open.cocci to analyze
alloc_file_pseudo" as well, but the previous commit did manually handle
the case of pipes and sockets.
The other case where we can avoid locking f_pos is the "this file
descriptor only has a single user and it is us, and thus there is no
need to lock it".
The second test was correct, although a bit subtle and worth just
re-iterating here. There are two kinds of other sources of references
to the same file descriptor: file descriptors that have been explicitly
shared across fork() or with dup(), and file tables having elevated
reference counts due to threading (or explicit file sharing with
clone()).
The first case would have incremented the file count explicitly, and in
the second case the previous __fdget() would have incremented it for us
and set the FDPUT_FPUT flag.
But in both cases the file count would be greater than one, so the
"file_count(file) > 1" test catches both situations. Also note that if
file_count is 1, that also means that no other thread can have access to
the file table, so there also cannot be races with concurrent calls to
dup()/fork()/clone() that would increment the file count any other way.
Link: https://lore.kernel.org/linux-fsdevel/20190413184404.GA13490@deco.navytux.spb.ru
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Eic Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Marco Elver <elver@google.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Paul McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We must stop GC, once the segment becomes fully valid. Otherwise, it can
produce another dirty segments by moving valid blocks in the segment partially.
Ramon hit no free segment panic sometimes and saw this case happens when
validating reliable file pinning feature.
Signed-off-by: Ramon Pantin <pantin@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Expose in /sys/fs/f2fs/<blockdev>/main_blkaddr the block address where the
main area starts. This allows user mode programs to determine:
- That pinned files that are made exclusively of fully allocated 2MB
segments will never be unpinned by the file system.
- Where the main area starts. This is required by programs that want to
verify if a file is made exclusively of 2MB f2fs segments, the alignment
boundary for segments starts at this address. Testing for 2MB alignment
relative to the start of the device is incorrect, because for some
filesystems main_blkaddr is not at a 2MB boundary relative to the start
of the device.
The entry will be used when validating reliable pinning file feature proposed
by "f2fs: support aligned pinned file".
Signed-off-by: Ramon Pantin <pantin@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Setting softlimit larger than hardlimit seems meaningless
for disk quota but currently it is allowed. In this case,
there may be a bit of comfusion for users when they run
df comamnd to directory which has project quota.
For example, we set 20M softlimit and 10M hardlimit of
block usage limit for project quota of test_dir(project id 123).
[root@hades f2fs]# repquota -P -a
*** Report for project quotas on device /dev/nvme0n1p8
Block grace time: 7days; Inode grace time: 7days
Block limits File limits
Project used soft hard grace used soft hard grace
----------------------------------------------------------------------
0 -- 4 0 0 1 0 0
123 +- 10248 20480 10240 2 0 0
The result of df command as below:
[root@hades f2fs]# df -h /mnt/f2fs/test
Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1p8 20M 11M 10M 51% /mnt/f2fs
Even though it looks like there is another 10M free space to use,
if we write new data to diretory test(inherit project id),
the write will fail with errno(-EDQUOT).
After this patch, the df result looks like below.
[root@hades f2fs]# df -h /mnt/f2fs/test
Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1p8 10M 10M 0 100% /mnt/f2fs
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In commit 3975b097e5 ("convert stream-like files -> stream_open, even
if they use noop_llseek") Kirill used a coccinelle script to change
"nonseekable_open()" to "stream_open()", which changed the trivial cases
of stream-like file descriptors to the new model with FMODE_STREAM.
However, the two big cases - sockets and pipes - don't actually have
that trivial pattern at all, and were thus never converted to
FMODE_STREAM even though it makes lots of sense to do so.
That's particularly true when looking forward to the next change:
getting rid of FMODE_ATOMIC_POS entirely, and just using FMODE_STREAM to
decide whether f_pos updates are needed or not. And if they are, we'll
always do them atomically.
This came up because KCSAN (correctly) noted that the non-locked f_pos
updates are data races: they are clearly benign for the case where we
don't care, but it would be good to just not have that issue exist at
all.
Note that the reason we used FMODE_ATOMIC_POS originally is that only
doing it for the minimal required case is "safer" in that it's possible
that the f_pos locking can cause unnecessary serialization across the
whole write() call. And in the worst case, that kind of serialization
can cause deadlock issues: think writers that need readers to empty the
state using the same file descriptor.
[ Note that the locking is per-file descriptor - because it protects
"f_pos", which is obviously per-file descriptor - so it only affects
cases where you literally use the same file descriptor to both read
and write.
So a regular pipe that has separate reading and writing file
descriptors doesn't really have this situation even though it's the
obvious case of "reader empties what a bit writer concurrently fills"
But we want to make pipes as being stream-line anyway, because we
don't want the unnecessary overhead of locking, and because a named
pipe can be (ab-)used by reading and writing to the same file
descriptor. ]
There are likely a lot of other cases that might want FMODE_STREAM, and
looking for ".llseek = no_llseek" users and other cases that don't have
an lseek file operation at all and making them use "stream_open()" might
be a good idea. But pipes and sockets are likely to be the two main
cases.
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Eic Dumazet <edumazet@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Marco Elver <elver@google.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Paul McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
$ sed -e 's/^ /\t/' -i */Kconfig
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Getting the same alert twice in a row is legal and normal,
especially on a fast device (like running in qemu). Kind of
like interrupts. So don't report duplicate alerts, and deliver
them normally.
[JD: Fixed subject]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Update signing key of first channel whenever generating the master
sigining/encryption/decryption keys rather than only in cifs_mount().
This also fixes reconnect when re-establishing smb sessions to other
servers.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The commit f05499a06f ("writeback: use ino_t for inodes in
tracepoints") introduced a lot of GCC compilation warnings on s390,
In file included from ./include/trace/define_trace.h:102,
from ./include/trace/events/writeback.h:904,
from fs/fs-writeback.c:82:
./include/trace/events/writeback.h: In function
'trace_raw_output_writeback_page_template':
./include/trace/events/writeback.h:76:12: warning: format '%lu' expects
argument of type 'long unsigned int', but argument 4 has type 'ino_t'
{aka 'unsigned int'} [-Wformat=]
TP_printk("bdi %s: ino=%lu index=%lu",
^~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:360:22: note: in definition of macro
'DECLARE_EVENT_CLASS'
trace_seq_printf(s, print); \
^~~~~
./include/trace/events/writeback.h:76:2: note: in expansion of macro
'TP_printk'
TP_printk("bdi %s: ino=%lu index=%lu",
^~~~~~~~~
Fix them by adding necessary casts where ino_t could be either "unsigned
int" or "unsigned long".
Fixes: f05499a06f ("writeback: use ino_t for inodes in tracepoints")
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Tejun Heo <tj@kernel.org>
The Scarlett 6i6 has no padding on rear inputs 3/4 but a gainstage.
This patch introduces this functionality as to be seen in the mac
or windows scarlett control.
The correct address could already be found in the dump info, but was
never used. Without this patch inputs 3/4 are quite unusable else.
Signed-off-by: Jens Verwiebe <info@jensverwiebe.de>
Link: https://lore.kernel.org/r/384d65cd-5e87-91eb-9fc3-e57226f534c6@jensverwiebe.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Patch series from Dan Williams:
At last years Plumbers Conference I proposed the Maintainer Entry
Profile as a document that a maintainer can provide to set contributor
expectations and provide fodder for a discussion between maintainers
about the merits of different maintainer policies.
For those that did not attend, the goal of the Maintainer Entry Profile
is to provide contributors documentation of patch submission
considerations that may vary by subsystem. The session introduction was:
The first rule of kernel maintenance is that there are no hard and
fast rules. That state of affairs is both a blessing and a curse. It
has served the community well to be adaptable to the different
people and different problem spaces that inhabit the kernel
community. However, that variability also leads to inconsistent
experiences for contributors, little to no guidance for new
contributors, and unnecessary stress on current maintainers.
To be clear, the proposed document does not impose or suggest new rules.
Instead it provides an outlet to document the existing unwritten
policies in effect for a given subsystem. Over time the hope is that
some of this variability can be up-levelled to new global process
policy, but in the meantime it provides relief for communicating the
guidelines that are being imposed on contributors.
[jc: resolved merge conflicts with the MAINTAINERS file, added a patch
to fix up various RST issues, and added a TOC section for the
profiles.]
Add blank lines where needed to get the document to render properly. Also
add a TOC of existing profiles just so that the nvdimm profile is linked
into the toctree, is discoverable, and doesn't generate a warning.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Looks like we've had the sgx sysconfig register and revision register
always wrong for omap4, including the old platform data. Let's fix the
offsets to what the TRM says. Otherwise the sgx module may never idle
depending on the state of the real sysconfig register.
Fixes: d23a163ebe ("ARM: dts: Add nodes for missing omap4 interconnect target modules")
Cc: H. Nikolaus Schaller <hns@goldelico.com>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
As presented at the 2018 Linux Plumbers conference [1], the Maintainer
Entry Profile (formerly Subsystem Profile) is proposed as a way to reduce
friction between committers and maintainers and encourage conversations
amongst maintainers about common best practices. While coding-style,
submit-checklist, and submitting-drivers lay out some common expectations
there remain local customs and maintainer preferences that vary by
subsystem.
The profile contains documentation of some of the common policy
questions a contributor might have that are local to the subsystem /
device-driver, special considerations for the subsystem, or other
guidelines that are otherwise not covered by the top-level process
documents.
The initial and hopefully non-controversial headings in the profile are:
Overview:
General introduction to how the subsystem operates
Submit Checklist Addendum:
Mechanical items that gate submission staging, or other requirements
that gate patch acceptance.
Key Cycle Dates:
- Last -rc for new feature submissions: Expected lead time for submissions
- Last -rc to merge features: Deadline for merge decisions
Resubmit Cadence: When and preferred method to follow up with the
maintainer
Note that coding style guidelines are explicitly left out of this list.
See Documentation/maintainer/maintainer-entry-profile.rst for more details,
and a follow-on example profile for the libnvdimm subsystem.
[1]: https://linuxplumbersconf.org/event/2/contributions/59/
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Steve French <stfrench@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tobin C. Harding <me@tobin.cc>
Cc: Olof Johansson <olof@lixom.net>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/157462919309.1729495.10585699280061787229.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
We used to skip reconnects on all SMB2_IOCTL commands due to SMB3+
FSCTL_VALIDATE_NEGOTIATE_INFO - which made sense since we're still
establishing a SMB session.
However, when refresh_cache_worker() calls smb2_get_dfs_refer() and
we're under reconnect, SMB2_ioctl() will not be able to get a proper
status error (e.g. -EHOSTDOWN in case we failed to reconnect) but an
-EAGAIN from cifs_send_recv() thus looping forever in
refresh_cache_worker().
Fixes: e99c63e4d8 ("SMB3: Fix deadlock in validate negotiate hits reconnect")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Suggested-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
We don't care about module aliasing validation in
cifs_compose_mount_options(..., is_smb3) when finding the root SMB
session of an DFS namespace in order to refresh DFS referral cache.
The following issue has been observed when mounting with '-t smb3' and
then specifying 'vers=2.0':
...
Nov 08 15:27:08 tw kernel: address conversion returned 0 for FS0.WIN.LOCAL
Nov 08 15:27:08 tw kernel: [kworke] ==> dns_query((null),FS0.WIN.LOCAL,13,(null))
Nov 08 15:27:08 tw kernel: [kworke] call request_key(,FS0.WIN.LOCAL,)
Nov 08 15:27:08 tw kernel: [kworke] ==> dns_resolver_cmp(FS0.WIN.LOCAL,FS0.WIN.LOCAL)
Nov 08 15:27:08 tw kernel: [kworke] <== dns_resolver_cmp() = 1
Nov 08 15:27:08 tw kernel: [kworke] <== dns_query() = 13
Nov 08 15:27:08 tw kernel: fs/cifs/dns_resolve.c: dns_resolve_server_name_to_ip: resolved: FS0.WIN.LOCAL to 192.168.30.26
===> Nov 08 15:27:08 tw kernel: CIFS VFS: vers=2.0 not permitted when mounting with smb3
Nov 08 15:27:08 tw kernel: fs/cifs/dfs_cache.c: CIFS VFS: leaving refresh_tcon (xid = 26) rc = -22
...
Fixes: 5072010ccf ("cifs: Fix DFS cache refresher for DFS links")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Fix a typo in the free slave id search loop. Instead of I2C_CLIENT_PEC,
it should have been I2C_CLIENT_TEN. The slave id 1 can only handle 7-bit
addresses and thus is not eligible in case of 10-bit addresses.
As a matter of fact none of the slave id support I2C_CLIENT_PEC, overall
check is performed at the beginning of the stm32f7_i2c_reg_slave function.
Fixes: 60d609f30d ("i2c: i2c-stm32f7: Add slave support")
Signed-off-by: Alain Volmat <alain.volmat@st.com>
Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The major drawback of commit 7e34f4e4aa ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.
As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster... Hopefully with
commensurate power saving!
v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aa ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f8747f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In the next patch, we will introduce a new asynchronous retirement
worker, fed by execlists CS events. Here we may queue a retirement as
soon as a request is submitted to HW (and completes instantly), and we
also want to process that retirement as early as possible and cannot
afford to postpone (as there may not be another opportunity to retire it
for a few seconds). To allow the new async retirer to run in parallel
with our submission, pull the __i915_request_queue (that passes the
request to HW) inside the timelines spinlock so that the retirement
cannot release the timeline before we have completed the submission.
v2: Actually to play nicely with engine_retire, we have to raise the
timeline.active_lock before releasing the HW. intel_gt_retire_requsts()
is still serialised by the outer lock so they cannot see this
intermediate state, and engine_retire is serialised by HW submission.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-2-chris@chris-wilson.co.uk
(cherry picked from commit 88a4655e75)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since we want to do a lockless read of the current active request, and
that request is written to by process_csb also without serialisation, we
need to instruct gcc to take care in reading the pointer itself.
Otherwise, we have observed execlists_active() to report 0x40.
[ 2400.760381] igt/para-4098 1..s. 2376479300us : process_csb: rcs0 cs-irq head=3, tail=4
[ 2400.760826] igt/para-4098 1..s. 2376479303us : process_csb: rcs0 csb[4]: status=0x00000001:0x00000000
[ 2400.761271] igt/para-4098 1..s. 2376479306us : trace_ports: rcs0: promote { b9c59:2622, b9c55:2624 }
[ 2400.761726] igt/para-4097 0d... 2376479311us : __i915_schedule: rcs0: -2147483648->3, inflight:0000000000000040, rq:ffff888208c1e940
which is impossible!
The answer is that as we keep the existing execlists->active pointing
into the array as we copy over that array, the unserialised read may see
a partial pointer value.
Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125094318.1630806-1-chris@chris-wilson.co.uk
(cherry picked from commit 331bf90591)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Danit Goldberg says:
====================
This series extends RTNETLINK to provide IB port and node GUIDs, which
were configured for Infiniband VFs.
The functionality to set VF GUIDs already existed for a long time, and
here we are adding the missing "get" so that netlink will be symmetric and
various cloud orchestration tools will be able to manage such VFs more
naturally.
The iproute2 was extended too to present those GUIDs.
- ip link show <device>
For example:
- ip link set ib4 vf 0 node_guid 22:44:33:00:33:11:00:33
- ip link set ib4 vf 0 port_guid 10:21:33:12:00:11:22:10
- ip link show ib4
ib4: <BROADCAST,MULTICAST> mtu 4092 qdisc noop state DOWN mode DEFAULT group default qlen 256
link/infiniband 00:00:0a:2d:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:44:36:8d brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 link/infiniband 00:00:0a:2d:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:44:36:8d brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff,
spoof checking off, NODE_GUID 22:44:33:00:33:11:00:33, PORT_GUID 10:21:33:12:00:11:22:10, link-state disable, trust off, query_rss off
====================
Based on the mlx5-next branch from
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for
dependencies
* branch 'ib-guids': (35 commits)
IB/mlx5: Implement callbacks for getting VFs GUID attributes
IB/ipoib: Add ndo operation for getting VFs GUID attributes
IB/core: Add interfaces to get VF node and port GUIDs
net/core: Add support for getting VF GUIDs
net/mlx5: Add new chain for netfilter flow table offload
net/mlx5: Refactor creating fast path prio chains
net/mlx5: Accumulate levels for chains prio namespaces
net/mlx5: Define fdb tc levels per prio
net/mlx5: Rename FDB_* tc related defines to FDB_TC_* defines
net/mlx5: Simplify fdb chain and prio eswitch defines
IB/mlx5: Load profile according to RoCE enablement state
IB/mlx5: Rename profile and init methods
net/mlx5: Handle "enable_roce" devlink param
net/mlx5: Document flow_steering_mode devlink param
devlink: Add new "enable_roce" generic device param
net/mlx5: fix spelling mistake "metdata" -> "metadata"
net/mlx5: fix kvfree of uninitialized pointer spec
IB/mlx5: Introduce and use mlx5_core_is_vf()
net/mlx5: E-switch, Enable metadata on own vport
net/mlx5: Refactor ingress acl configuration
...
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
In commit a79ca656b6 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.
As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finish queuing the
request to the engine.
v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.
v4: Combine timeline->lock and engine pm release; it's hairy.
Fixes: a79ca656b6 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-2-chris@chris-wilson.co.uk
(cherry picked from commit 5cba288466)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>