After mm.h was removed from the asm-generic version of cacheflush.h,
s390 allyesconfig shows several warnings of the following nature:
In file included from arch/s390/include/generated/asm/cacheflush.h:1,
from drivers/media/platform/omap3isp/isp.c:42:
include/asm-generic/cacheflush.h:16:42: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration
As Geert and Laurent point out, this driver does not need this header in
the two files that include it. Remove it so there are no warnings.
Link: http://lkml.kernel.org/r/20200622234740.72825-2-natechancellor@gmail.com
Fixes: e0cf615d72 ("asm-generic: don't include <linux/mm.h> in cacheflush.h")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some s390 builds get these warnings:
include/asm-generic/cacheflush.h:16:42: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:22:46: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:28:45: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:36:44: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:44:45: warning: 'struct page' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:52:50: warning: 'struct address_space' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:58:52: warning: 'struct address_space' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:75:17: warning: 'struct page' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:74:45: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:82:16: warning: 'struct page' declared inside parameter list will not be visible outside of this definition or declaration
include/asm-generic/cacheflush.h:81:50: warning: 'struct vm_area_struct' declared inside parameter list will not be visible outside of this definition or declaration
Forward declare the named structs to get rid of these.
Link: http://lkml.kernel.org/r/20200623135714.4dae4b8a@canb.auug.org.au
Fixes: e0cf615d72 ("asm-generic: don't include <linux/mm.h> in cacheflush.h")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit 9e343b467c ("READ_ONCE: Enforce atomicity for
{READ,WRITE}_ONCE() memory accesses"), READ_ONCE() cannot be used
anymore to read complex page table entries.
This leads to:
CC mm/debug_vm_pgtable.o
In file included from ./include/asm-generic/bug.h:5,
from ./arch/powerpc/include/asm/bug.h:109,
from ./include/linux/bug.h:5,
from ./include/linux/mmdebug.h:5,
from ./include/linux/gfp.h:5,
from mm/debug_vm_pgtable.c:13:
In function 'pte_clear_tests',
inlined from 'debug_vm_pgtable' at mm/debug_vm_pgtable.c:363:2:
./include/linux/compiler.h:392:38: error: Unsupported access size for {READ,WRITE}_ONCE().
mm/debug_vm_pgtable.c:249:14: note: in expansion of macro 'READ_ONCE'
249 | pte_t pte = READ_ONCE(*ptep);
| ^~~~~~~~~
make[2]: *** [mm/debug_vm_pgtable.o] Error 1
Fix it by using the recently added ptep_get() helper.
Link: http://lkml.kernel.org/r/6ca8c972e6c920dc4ae0d4affbed9703afa4d010.1592490570.git.christophe.leroy@csgroup.eu
Fixes: 9e343b467c ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Murphy reports that a slightly overcommitted load, testing swap
and zram along with i915, splats and keeps on splatting, when it had
better fail less noisily:
gnome-shell: page allocation failure: order:0,
mode:0x400d0(__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_RECLAIMABLE),
nodemask=(null),cpuset=/,mems_allowed=0
CPU: 2 PID: 1155 Comm: gnome-shell Not tainted 5.7.0-1.fc33.x86_64 #1
Call Trace:
dump_stack+0x64/0x88
warn_alloc.cold+0x75/0xd9
__alloc_pages_slowpath.constprop.0+0xcfa/0xd30
__alloc_pages_nodemask+0x2df/0x320
alloc_slab_page+0x195/0x310
allocate_slab+0x3c5/0x440
___slab_alloc+0x40c/0x5f0
__slab_alloc+0x1c/0x30
kmem_cache_alloc+0x20e/0x220
xas_nomem+0x28/0x70
add_to_swap_cache+0x321/0x400
__read_swap_cache_async+0x105/0x240
swap_cluster_readahead+0x22c/0x2e0
shmem_swapin+0x8e/0xc0
shmem_swapin_page+0x196/0x740
shmem_getpage_gfp+0x3a2/0xa60
shmem_read_mapping_page_gfp+0x32/0x60
shmem_get_pages+0x155/0x5e0 [i915]
__i915_gem_object_get_pages+0x68/0xa0 [i915]
i915_vma_pin+0x3fe/0x6c0 [i915]
eb_add_vma+0x10b/0x2c0 [i915]
i915_gem_do_execbuffer+0x704/0x3430 [i915]
i915_gem_execbuffer2_ioctl+0x1ea/0x3e0 [i915]
drm_ioctl_kernel+0x86/0xd0 [drm]
drm_ioctl+0x206/0x390 [drm]
ksys_ioctl+0x82/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5b/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported on 5.7, but it goes back really to 3.1: when
shmem_read_mapping_page_gfp() was implemented for use by i915, and
allowed for __GFP_NORETRY and __GFP_NOWARN flags in most places, but
missed swapin's "& GFP_KERNEL" mask for page tree node allocation in
__read_swap_cache_async() - that was to mask off HIGHUSER_MOVABLE bits
from what page cache uses, but GFP_RECLAIM_MASK is now what's needed.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208085
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2006151330070.11064@eggly.anvils
Fixes: 68da9f0557 ("tmpfs: pass gfp to shmem_getpage_gfp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Chris Murphy <lists@colorremedies.com>
Analyzed-by: Vlastimil Babka <vbabka@suse.cz>
Analyzed-by: Matthew Wilcox <willy@infradead.org>
Tested-by: Chris Murphy <lists@colorremedies.com>
Cc: <stable@vger.kernel.org> [3.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It was found that running the LTP test on a PowerPC system could produce
erroneous values in /proc/meminfo, like:
MemTotal: 531915072 kB
MemFree: 507962176 kB
MemAvailable: 1100020596352 kB
Using bisection, the problem is tracked down to commit 9c315e4d7d ("mm:
memcg/slab: cache page number in memcg_(un)charge_slab()").
In memcg_uncharge_slab() with a "int order" argument:
unsigned int nr_pages = 1 << order;
:
mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);
The mod_lruvec_state() function will eventually call the
__mod_zone_page_state() which accepts a long argument. Depending on the
compiler and how inlining is done, "-nr_pages" may be treated as a
negative number or a very large positive number. Apparently, it was
treated as a large positive number in that PowerPC system leading to
incorrect stat counts. This problem hasn't been seen in x86-64 yet,
perhaps the gcc compiler there has some slight difference in behavior.
It is fixed by making nr_pages a signed value. For consistency, a similar
change is applied to memcg_charge_slab() as well.
Link: http://lkml.kernel.org/r/20200620184719.10994-1-longman@redhat.com
Fixes: 9c315e4d7d ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signature verification is an important security feature, to protect
system from being attacked with a kernel of unknown origin. Kexec
rebooting is a way to replace the running kernel, hence need be secured
carefully.
In the current code of handling signature verification of kexec kernel,
the logic is very twisted. It mixes signature verification, IMA
signature appraising and kexec lockdown.
If there is no KEXEC_SIG_FORCE, kexec kernel image doesn't have one of
signature, the supported crypto, and key, we don't think this is wrong,
Unless kexec lockdown is executed. IMA is considered as another kind of
signature appraising method.
If kexec kernel image has signature/crypto/key, it has to go through the
signature verification and pass. Otherwise it's seen as verification
failure, and won't be loaded.
Seems kexec kernel image with an unqualified signature is even worse
than those w/o signature at all, this sounds very unreasonable. E.g.
If people get a unsigned kernel to load, or a kernel signed with expired
key, which one is more dangerous?
So, here, let's simplify the logic to improve code readability. If the
KEXEC_SIG_FORCE enabled or kexec lockdown enabled, signature
verification is mandated. Otherwise, we lift the bar for any kernel
image.
Link: http://lkml.kernel.org/r/20200602045952.27487-1-lijiang@redhat.com
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Reviewed-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Matthew Garrett <mjg59@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh reports:
"While stressing compaction, one run oopsed on NULL capc->cc in
__free_one_page()'s task_capc(zone): compact_zone_order() had been
interrupted, and a page was being freed in the return from interrupt.
Though you would not expect it from the source, both gccs I was using
(4.8.1 and 7.5.0) had chosen to compile compact_zone_order() with the
".cc = &cc" implemented by mov %rbx,-0xb0(%rbp) immediately before
callq compact_zone - long after the "current->capture_control =
&capc". An interrupt in between those finds capc->cc NULL (zeroed by
an earlier rep stos).
This could presumably be fixed by a barrier() before setting
current->capture_control in compact_zone_order(); but would also need
more care on return from compact_zone(), in order not to risk leaking
a page captured by interrupt just before capture_control is reset.
Maybe that is the preferable fix, but I felt safer for task_capc() to
exclude the rather surprising possibility of capture at interrupt
time"
I have checked that gcc10 also behaves the same.
The advantage of fix in compact_zone_order() is that we don't add
another test in the page freeing hot path, and that it might prevent
future problems if we stop exposing pointers to uninitialized structures
in current task.
So this patch implements the suggestion for compact_zone_order() with
barrier() (and WRITE_ONCE() to prevent store tearing) for setting
current->capture_control, and prevents page leaking with
WRITE_ONCE/READ_ONCE in the proper order.
Link: http://lkml.kernel.org/r/20200616082649.27173-1-vbabka@suse.cz
Fixes: 5e1f0f098b ("mm, compaction: capture a page under direct compaction")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Hugh Dickins <hughd@google.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Li Wang <liwang@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [5.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some occasions task held spinlock (mac_filter_hash_lock),
while being rescheduled due to admin queue mutex_lock. The struct
i40e_spinlock asq_spinlock, which later expands to struct mutex
spinlock. Moved i40e_aq_set_vsi_multicast_promiscuous(),
i40e_aq_set_vsi_unicast_promiscuous(),
i40e_aq_set_vsi_mc_promisc_on_vlan(), and
i40e_aq_set_vsi_uc_promisc_on_vlan() outside of atomic context. Without
this patch there is a race condition, which might result in scheduling
while in atomic context. The race condition is between the thread, which
holds mac_filter_hash_lock, while trying to acquire an admin queue mutex
and a thread, which already has said admin queue mutex. The thread, which
holds spinlock, fails to acquire the mutex, which causes this thread to
sleep.
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
As with other networking drivers, remove the unnecessary driver version
from the Intel drivers. The ethtool driver information and module version
will then report the kernel version instead.
For ixgbe, i40e and ice drivers, the driver passes the driver version to
the firmware to confirm that we are up and running. So we now pass the
value of UTS_RELEASE to the firmware. This adminq call is required per
the HAS document. The Device then sends an indication to the BMC that the
PF driver is present. This is done using Host NC Driver Status Indication
in NC-SI Get Link command or via the Host Network Controller Driver Status
Change AEN.
What the BMC may do with this information is implementation-dependent, but
this is a standard NC-SI 1.1 command we honor per the HAS.
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Alek Loktionov <aleksandr.loktionov@intel.com>
CC: Kevin Liedtke <kevin.d.liedtke@intel.com>
CC: Aaron Rowden <aaron.f.rowden@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
A customer was chain-booting to provision his systems and one of the
steps was setting MFS. MFS isn't cleared by normal warm reboots
(clearing requires a GLOBR) and there was no indication of why Jumbo
Frame receives were failing.
Add a warning if MFS is set to anything lower than the default.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Detect and log information about pre-recovery mode when firmware
transitions to a recovery mode.
When a firmware transitions to a recovery mode it stores a number
of unexpected EMP resets in one of its registers. The number of EMP
resets ranging from 0x21 to 0x2A indicates that FW transitions
to recovery mode. Use these values to emit log entry about transition
process. Previously the pre-recovery mode may not have been detected
and there was no log entry when NIC was in pre-recovery mode.
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use jiffies to limit max waiting time for PF reset to succeed.
Previous wait loop was unreliable. It required unreasonably long time
to wait for PF reset after reboot when NIC was about to enter
recovery mode
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Short summary of fixes pull (less than what git shortlog provides):
* In mcde, set up fbdev after device registration and removde the last access
to dev->dev_private. Fixes an error message and a segmentation fault.
* Set the connector type for LogicPT Type 28 and newhaven_nhd_43_480272ef_atxl
panels.
* In uvesafb, fix the handling of the noblank option.
* Fix panel orientation for Asus T101HA and Acer S1003.
* Fix DMA configuration for sun4i if IOMMU is present.
* Fix regression in VT restoration. Unbreaks userspace (i.e., Xorg) VT handling.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200625082717.GA14856@linux-uq9g
We use OUTPUT directory as TMPOUT for checking no-pie option.
Since commit f2f02ebd8f ("kbuild: improve cc-option to clean up all
temporary files") when building powerpc/ from selftests directory, the
OUTPUT directory points to powerpc/pmu/ebb/ and gets removed when
checking for -no-pie option in try-run routine, subsequently build
fails with the following:
$ make -C powerpc
...
TARGET=ebb; BUILD_TARGET=$OUTPUT/$TARGET; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C $TARGET all
make[2]: Entering directory '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb'
make[2]: *** No rule to make target 'Makefile'.
make[2]: Failed to remake makefile 'Makefile'.
make[2]: *** No rule to make target 'ebb.c', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'ebb_handler.S', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'trace.c', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: *** No rule to make target 'busy_loop.S', needed by '/home/linux-master/tools/testing/selftests/powerpc/pmu/ebb/reg_access_test'.
make[2]: Target 'all' not remade because of errors.
Fix this by adding a suffix to the OUTPUT directory so that the
failure is avoided.
Fixes: 9686813f6e ("selftests/powerpc: Fix try-run when source tree is not writable")
Signed-off-by: Harish <harish@linux.ibm.com>
[mpe: Mention that commit that triggered the breakage]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200625165721.264904-1-harish@linux.ibm.com
Minor overlapping changes in xfrm_device.c, between the double
ESP trailing bug fix setting the XFRM_INIT flag and the changes
in net-next preparing for bonding encryption support.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Don't insert ESP trailer twice in IPSEC code, from Huy Nguyen.
2) The default crypto algorithm selection in Kconfig for IPSEC is out
of touch with modern reality, fix this up. From Eric Biggers.
3) bpftool is missing an entry for BPF_MAP_TYPE_RINGBUF, from Andrii
Nakryiko.
4) Missing init of ->frame_sz in xdp_convert_zc_to_xdp_frame(), from
Hangbin Liu.
5) Adjust packet alignment handling in ax88179_178a driver to match
what the hardware actually does. From Jeremy Kerr.
6) register_netdevice can leak in the case one of the notifiers fail,
from Yang Yingliang.
7) Use after free in ip_tunnel_lookup(), from Taehee Yoo.
8) VLAN checks in sja1105 DSA driver need adjustments, from Vladimir
Oltean.
9) tg3 driver can sleep forever when we get enough EEH errors, fix from
David Christensen.
10) Missing {READ,WRITE}_ONCE() annotations in various Intel ethernet
drivers, from Ciara Loftus.
11) Fix scanning loop break condition in of_mdiobus_register(), from
Florian Fainelli.
12) MTU limit is incorrect in ibmveth driver, from Thomas Falcon.
13) Endianness fix in mlxsw, from Ido Schimmel.
14) Use after free in smsc95xx usbnet driver, from Tuomas Tynkkynen.
15) Missing bridge mrp configuration validation, from Horatiu Vultur.
16) Fix circular netns references in wireguard, from Jason A. Donenfeld.
17) PTP initialization on recovery is not done properly in qed driver,
from Alexander Lobakin.
18) Endian conversion of L4 ports in filters of cxgb4 driver is wrong,
from Rahul Lakkireddy.
19) Don't clear bound device TX queue of socket prematurely otherwise we
get problems with ktls hw offloading, from Tariq Toukan.
20) ipset can do atomics on unaligned memory, fix from Russell King.
21) Align ethernet addresses properly in bridging code, from Thomas
Martitz.
22) Don't advertise ipv4 addresses on SCTP sockets having ipv6only set,
from Marcelo Ricardo Leitner.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (149 commits)
rds: transport module should be auto loaded when transport is set
sch_cake: fix a few style nits
sch_cake: don't call diffserv parsing code when it is not needed
sch_cake: don't try to reallocate or unshare skb unconditionally
ethtool: fix error handling in linkstate_prepare_data()
wil6210: account for napi_gro_receive never returning GRO_DROP
hns: do not cast return value of napi_gro_receive to null
socionext: account for napi_gro_receive never returning GRO_DROP
wireguard: receive: account for napi_gro_receive never returning GRO_DROP
vxlan: fix last fdb index during dump of fdb with nhid
sctp: Don't advertise IPv4 addresses if ipv6only is set on the socket
tc-testing: avoid action cookies with odd length.
bpf: tcp: bpf_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT
net: dsa: sja1105: fix tc-gate schedule with single element
net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules
net: dsa: sja1105: unconditionally free old gating config
net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top
net: macb: free resources on failure path of at91ether_open()
net: macb: call pm_runtime_put_sync on failure path
...
Change tin mapping on diffserv3, 4 & 8 for LE PHB support, in essence
making LE a member of the Bulk tin.
Bulk has the least priority and minimum of 1/16th total bandwidth in the
face of higher priority traffic.
NB: Diffserv 3 & 4 swap tin 0 & 1 priorities from the default order as
found in diffserv8, in case anyone is wondering why it looks a bit odd.
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
[ reword commit message slightly ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toke Høiland-Jørgensen says:
====================
sched: A couple of fixes for sch_cake
This series contains a couple of fixes for diffserv handling in sch_cake that
provide a nice speedup (with a somewhat pedantic nit fix tacked on to the end).
Not quite sure about whether this should go to stable; it does provide a nice
speedup, but it's not strictly a fix in the "correctness" sense. I lean towards
including this in stable as well, since our most important consumer of that
(OpenWrt) is likely to backport the series anyway.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
I spotted a few nits when comparing the in-tree version of sch_cake with
the out-of-tree one: A redundant error variable declaration shadowing an
outer declaration, and an indentation alignment issue. Fix both of these.
Fixes: 046f6fd5da ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As a further optimisation of the diffserv parsing codepath, we can skip it
entirely if CAKE is configured to neither use diffserv-based
classification, nor to zero out the diffserv bits.
Fixes: c87b4ecdbe ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cake_handle_diffserv() tries to linearize mac and network header parts of
skb and to make it writable unconditionally. In some cases it leads to full
skb reallocation, which reduces throughput and increases CPU load. Some
measurements of IPv4 forward + NAPT on MIPS router with 580 MHz single-core
CPU was conducted. It appears that on kernel 4.9 skb_try_make_writable()
reallocates skb, if skb was allocated in ethernet driver via so-called
'build skb' method from page cache (it was discovered by strange increase
of kmalloc-2048 slab at first).
Obtain DSCP value via read-only skb_header_pointer() call, and leave
linearization only for DSCP bleaching or ECN CE setting. And, as an
additional optimisation, skip diffserv parsing entirely if it is not needed
by the current configuration.
Fixes: c87b4ecdbe ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
Signed-off-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
[ fix a few style issues, reflow commit message ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart says:
====================
net: phy: mscc: multiple improvements
This series contains various improvements to the MSCC PHY driver, fixing
sparse and smatch warnings, using functions provided by the PHY core,
and improving the driver consistency and maintenance.
I don't think any of those improvements and fixes is worth backporting
to stable trees.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
All PHY read and write return values are checked for errors in
vsc8514_config_init and vsc8584_config_init, except for one. Fix this.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the middle of vsc8584_config_init and vsc8514_config_init, the page
is set to 'standard'. This is the default value, and the page isn't set
to another value before. Those pages configuration can be safely
removed.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the vsc8584_config_init and vsc8514_config_init, the base page is set
to 'GPIO', configuration is done, and the page is never explicitly
restored to the standard page. No bug was triggered as it turns out
helpers called in those config_init functions do modify the base page,
and set it back to standard. But that is dangerous and any modification
to those functions would introduce bugs. This patch fixes this, to
improve maintenance, by restoring the base page to 'standard' once
'GPIO' accesses are completed.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch improves the MSCC driver by using the provided
phy_lock_mdio_bus and phy_unlock_mdio_bus helpers instead of locking and
unlocking the MDIO bus lock directly. The patch is only cosmetic but
should improve maintenance and consistency.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a typo in a comment, s/Ths/This/. The patch is cosmetic
only.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following error was reported by smatch:
vsc85xx_ts_read_csr() error: uninitialized symbol 'blk_hw'.
In practice this is very unlikely, as all the block identifiers given to
this functions are handled and described in an enum. The smatch error is
fixed by doing what is already done in vsc85xx_ts_write_csr: using the
"PROCESSOR" block by default.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On vsc8584_ptp_init failure we jump to the 'err' label, which unlocks
the MDIO bus lock. But vsc8584_ptp_init isn't called with the MDIO bus
lock taken, which could result in a double unlock. Fix this.
Fixes: ab2bf93393 ("net: phy: mscc: 1588 block initialization")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following sparse warnings when building MACsec
support in the MSCC PHY driver.
mscc_macsec.c:393:42: warning: cast from restricted sci_t
mscc_macsec.c:395:42: warning: restricted sci_t degrades to integer
mscc_macsec.c:402:42: warning: restricted __be16 degrades to integer
mscc_macsec.c:608:34: warning: cast from restricted sci_t
mscc_macsec.c:610:34: warning: restricted sci_t degrades to integer
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
When getting SQI or maximum SQI value fails in linkstate_prepare_data(), we
must not return without calling ethnl_ops_complete(dev) as that could
result in imbalance between ethtool_ops ->begin() and ->complete() calls.
Fixes: 8066021915 ("ethtool: provide UAPI for PHY Signal Quality Index (SQI)")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull tracing fixes from Steven Rostedt:
"Four small fixes:
- Fix a ringbuffer bug for nested events having time go backwards
- Fix a config dependency for boot time tracing to depend on
synthetic events instead of histograms.
- Fix trigger format parsing to handle multiple spaces
- Fix bootconfig to handle failures in multiple events"
* tag 'trace-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/boottime: Fix kprobe multiple events
tracing: Fix event trigger to accept redundant spaces
tracing/boot: Fix config dependency for synthedic event
ring-buffer: Zero out time extend if it is nested and not absolute
Jason A. Donenfeld says:
====================
napi_gro_receive caller return value cleanups
In 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in
napi_gro_receive()"), the GRO_NORMAL case stopped calling
netif_receive_skb_internal, checking its return value, and returning
GRO_DROP in case it failed. Instead, it calls into
netif_receive_skb_list_internal (after a bit of indirection), which
doesn't return any error. Therefore, napi_gro_receive will never return
GRO_DROP, making handling GRO_DROP dead code.
I emailed the author of 6570bc79c0 on netdev [1] to see if this change
was intentional, but the dlink.ru email address has been disconnected,
and looking a bit further myself, it seems somewhat infeasible to start
propagating return values backwards from the internal machinations of
netif_receive_skb_list_internal.
Taking a look at all the callers of napi_gro_receive, it appears that
three are checking the return value for the purpose of comparing it to
the now never-happening GRO_DROP, and one just casts it to (void), a
likely historical leftover. Every other of the 120 callers does not
bother checking the return value.
And it seems like these remaining 116 callers are doing the right thing:
after calling napi_gro_receive, the packet is now in the hands of the
upper layers of the newtworking, and the device driver itself has no
business now making decisions based on what the upper layers choose to
do. Incrementing stats counters on GRO_DROP seems like a mistake, made
by these three drivers, but not by the remaining 117.
It would seem, therefore, that after rectifying these four callers of
napi_gro_receive, that I should go ahead and just remove returning the
value from napi_gro_receive all together. However, napi_gro_receive has
a function event tracer, and being able to introspect into the
networking stack to see how often napi_gro_receive is returning whatever
interesting GRO status (aside from _DROP) remains an interesting
data point worth keeping for debugging.
So, this series simply gets rid of the return value checking for the
four useless places where that check never evaluates to anything
meaningful.
[1] https://lore.kernel.org/netdev/20200624210606.GA1362687@zx2c4.com/
====================
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands. In this case, too, the non-gro path didn't bother checking
the return value. Plus, this had some clunky debugging functions that
duplicated code from elsewhere and was generally pretty messy. So, this
commit cleans that all up too.
Fixes: 6570bc79c0 ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Basically no drivers care about the return value here, and there's no
__must_check that would make casting to void sensible, so remove it.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>