android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Simon Horman	a1165b5919	net/sched: act_tunnel_key: disambiguate metadata dst error cases Metadata may be NULL for one of two reasons: * Missing user input * Failure to allocate the metadata dst Disambiguate these case by returning -EINVAL for the former and -ENOMEM for the latter rather than -EINVAL for both cases. This is in preparation for using extended ack to provide more information to users when parsing their input. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 23:50:26 +09:00
Christoph Hellwig	e88958e636	net: handle NULL ->poll gracefully The big aio poll revert broke various network protocols that don't implement ->poll as a patch in the aio poll serie removed sock_no_poll and made the common code handle this case. Reported-by: syzbot+57727883dbad76db2ef0@syzkaller.appspotmail.com Reported-by: syzbot+cdb0d3176b53d35ad454@syzkaller.appspotmail.com Reported-by: syzbot+2c7e8f74f8b2571c87e8@syzkaller.appspotmail.com Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Fixes: `a11e1d432b` ("Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-06-29 06:51:51 -07:00
Xin Long	b0e9a2fe3f	sctp: add support for SCTP_REUSE_PORT sockopt This feature is actually already supported by sk->sk_reuse which can be set by socket level opt SO_REUSEADDR. But it's not working exactly as RFC6458 demands in section 8.1.27, like: - This option only supports one-to-one style SCTP sockets - This socket option must not be used after calling bind() or sctp_bindx(). Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs. Otherwise, the programs with SCTP_REUSE_PORT from other systems will not work in linux. To separate it from the socket level version, this patch adds 'reuse' in sctp_sock and it works pretty much as sk->sk_reuse, but with some extra setup limitations that are needed when it is being enabled. "It should be noted that the behavior of the socket-level socket option to reuse ports and/or addresses for SCTP sockets is unspecified", so it leaves SO_REUSEADDR as is for the compatibility. Note that the name SCTP_REUSE_PORT is somewhat confusing, as its functionality is nearly identical to SO_REUSEADDR, but with some extra restrictions. Here it uses 'reuse' in sctp_sock instead of 'reuseport'. As for sk->sk_reuseport support for SCTP, it will be added in another patch. Thanks to Neil to make this clear. v1->v2: - add sctp_sk->reuse to separate it from the socket level version. v2->v3: - improve changelog according to Marcelo's suggestion. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 22:20:55 +09:00
David S. Miller	0933cc294f	Merge tag 'mac80211-for-davem-2018-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just three fixes: * fix HT operation in mesh mode * disable preemption in control frame TX * check nla_parse_nested() return values where missing (two places) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 22:09:26 +09:00
Shakeel Butt	e699e2c6a6	net, mm: account sock objects to kmemcg Currently the kernel accounts the memory for network traffic through mem_cgroup_[un]charge_skmem() interface. However the memory accounted only includes the truesize of sk_buff which does not include the size of sock objects. In our production environment, with opt-out kmem accounting, the sock kmem caches (TCP[v6], UDP[v6], RAW[v6], UNIX) are among the top most charged kmem caches and consume a significant amount of memory which can not be left as system overhead. So, this patch converts the kmem caches of all sock objects to SLAB_ACCOUNT. Signed-off-by: Shakeel Butt <shakeelb@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 21:56:27 +09:00
Omer Efrat	a421775058	mac80211: use BIT_ULL for NL80211_STA_INFO_* attribute types The BIT macro uses unsigned long which some architectures handle as 32 bit and therefore might cause macro's shift to overflow when used on a value equals or larger than 32 (NL80211_STA_INFO_RX_DURATION and afterwards). Since 'filled' member in station_info changed to u64, BIT_ULL macro should be used with all NL80211_STA_INFO_* attribute types instead of BIT to prevent future possible bugs when one will use BIT macro for higher attributes by mistake. This commit cleans up all usages of BIT macro with the above field in mac80211 by changing it to BIT_ULL instead. Signed-off-by: Omer Efrat <omer.efrat@tandemg.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:53:09 +02:00
Omer Efrat	397c657a06	cfg80211: use BIT_ULL for NL80211_STA_INFO_* attribute types The BIT macro uses unsigned long which some architectures handle as 32 bit and therefore might cause macro's shift to overflow when used on a value equals or larger than 32 (NL80211_STA_INFO_RX_DURATION and afterwards). Since 'filled' member in station_info changed to u64, BIT_ULL macro should be used with all NL80211_STA_INFO_* attribute types instead of BIT to prevent future possible bugs when one will use BIT macro for higher attributes by mistake. This commit cleans up all usages of BIT macro with the above field in cfg80211 by changing it to BIT_ULL instead. In addition, there are some places which don't use BIT nor BIT_ULL macros so align those as well. Signed-off-by: Omer Efrat <omer.efrat@tandemg.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:52:23 +02:00
Johannes Berg	f0c0407d2a	mac80211: remove unnecessary NULL check We don't need to check if he_oper is NULL before calling ieee80211_verify_sta_he_mcs_support() as it - now - will correctly check this itself. Remove the redundant check. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:51:39 +02:00
Gustavo A. R. Silva	47aa7861b9	mac80211: fix potential null pointer dereference he_op is being dereferenced before it is null checked, hence there is a potential null pointer dereference. Fix this by moving the pointer dereference after he_op has been properly null checked. Notice that, currently, he_op is already being null checked before calling this function at 4593: 4593 if (!he_oper || 4594 !ieee80211_verify_sta_he_mcs_support(sband, he_oper)) 4595 ifmgd->flags |= IEEE80211_STA_DISABLE_HE; but in case ieee80211_verify_sta_he_mcs_support is ever called without verifying he_oper is not null, we will end up having a null pointer dereference. So, we better don't take any chances. Addresses-Coverity-ID: 1470068 ("Dereference before null check") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:50:43 +02:00
Arnd Bergmann	fe0984d389	cfg80211: track time using boottime The cfg80211 layer uses get_seconds() to read the current time in its suspend handling. This function is deprecated because of the 32-bit time_t overflow, and it can cause unexpected behavior when the time changes due to settimeofday() calls or leap second updates. In many cases, we want to use monotonic time instead, however cfg80211 explicitly tracks the time spent in suspend, so this changes the driver over to use ktime_get_boottime_seconds(), which is slightly slower, but not used in a fastpath here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:49:28 +02:00
Johannes Berg	95bca62fb7	nl80211: check nla_parse_nested() return values At the very least we should check the return value if nla_parse_nested() is called with a non-NULL policy. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:44:51 +02:00
Bob Copeland	188f60ab8e	nl80211: relax ht operation checks for mesh Commit `9757235f45`, "nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value") relaxed the range for the HT operation field in meshconf, while also adding checks requiring the non-greenfield and non-ht-sta bits to be set in certain circumstances. The latter bit is actually reserved for mesh BSSes according to Table 9-168 in 802.11-2016, so in fact it should not be set. wpa_supplicant sets these bits because the mesh and AP code share the same implementation, but authsae does not. As a result, some meshconf updates from authsae which set only the NONHT_MIXED protection bits were being rejected. In order to avoid breaking userspace by changing the rules again, simply accept the values with or without the bits set, and mask off the reserved bit to match the spec. While in here, update the 802.11-2012 reference to 802.11-2016. Fixes: `9757235f45` ("nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value") Cc: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Bob Copeland <bobcopeland@fb.com> Reviewed-by: Masashi Honma <masashi.honma@gmail.com> Reviewed-by: Masashi Honma <masashi.honma@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:39:30 +02:00
Denis Kenzior	e7441c9274	mac80211: disable BHs/preemption in ieee80211_tx_control_port() On pre-emption enabled kernels the following print was being seen due to missing local_bh_disable/local_bh_enable calls. mac80211 assumes that pre-emption is disabled in the data path. BUG: using smp_processor_id() in preemptible [00000000] code: iwd/517 caller is __ieee80211_subif_start_xmit+0x144/0x210 [mac80211] [...] Call Trace: dump_stack+0x5c/0x80 check_preemption_disabled.cold.0+0x46/0x51 __ieee80211_subif_start_xmit+0x144/0x210 [mac80211] Fixes: `9118064914` ("mac80211: Add support for tx_control_port") Signed-off-by: Denis Kenzior <denkenz@gmail.com> [commit message rewrite, fixes tag] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2018-06-29 09:39:08 +02:00
Tom Herbert	b6e71bdebb	ila: Flush netlink command to clear xlat table Add ILA_CMD_FLUSH netlink command to clear the ILA translation table. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
Tom Herbert	ad68147ef2	ila: Create main ila source file Create a main ila file that contains the module initialization functions as well as netlink definitions. Previously these were defined in ila_xlat and ila_common. This approach allows better extensibility. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
Tom Herbert	b893281715	ila: Call library function alloc_bucket_locks To allocate the array of bucket locks for the hash table we now call library function alloc_bucket_spinlocks. Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
Tom Herbert	f7a2ba5ab9	ila: Fix use of rhashtable walk in ila_xlat.c Perform better EAGAIN handling, handle case where ila_dump_info fails and we missed objects in the dump, and add a skip index to skip over ila entries in a list on a rhashtable node that have already been visited (by a previous call to ila_nl_dump). Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-29 11:32:55 +09:00
David Ahern	4c79579b44	bpf: Change bpf_fib_lookup to return lookup status For ACLs implemented using either FIB rules or FIB entries, the BPF program needs the FIB lookup status to be able to drop the packet. Since the bpf_fib_lookup API has not reached a released kernel yet, change the return code to contain an encoding of the FIB lookup result and return the nexthop device index in the params struct. In addition, inform the BPF program of any post FIB lookup reason as to why the packet needs to go up the stack. The fib result for unicast routes must have an egress device, so remove the check that it is non-NULL. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-06-29 00:02:02 +02:00
Linus Torvalds	a11e1d432b	Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls. Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections. But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign. [ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-06-28 10:40:47 -07:00
Flavio Leitner	9c4c325252	skbuff: preserve sock reference when scrubbing the skb. The sock reference is lost when scrubbing the packet and that breaks TSQ (TCP Small Queues) and XPS (Transmit Packet Steering) causing performance impacts of about 50% in a single TCP stream when crossing network namespaces. XPS breaks because the queue mapping stored in the socket is not available, so another random queue might be selected when the stack needs to transmit something like a TCP ACK, or TCP Retransmissions. That causes packet re-ordering and/or performance issues. TSQ breaks because it orphans the packet while it is still in the host, so packets are queued contributing to the buffer bloat problem. Preserving the sock reference fixes both issues. The socket is orphaned anyways in the receiving path before any relevant action and on TX side the netfilter checks if the reference is local before use it. Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:21:32 +09:00
Flavio Leitner	f564650106	netfilter: check if the socket netns is correct. Netfilter assumes that if the socket is present in the skb, then it can be used because that reference is cleaned up while the skb is crossing netns. We want to change that to preserve the socket reference in a future patch, so this is a preparation updating netfilter to check if the socket netns matches before use it. Signed-off-by: Flavio Leitner <fbl@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:21:32 +09:00
Roman Mashak	4305274153	net sched actions: avoid bitwise operation on signed value in pedit Since char can be unsigned or signed, and bitwise operators may have implementation-dependent results when performed on signed operands, declare 'u8 *' operand instead. Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:12:03 +09:00
Roman Mashak	95b0d2dc13	net sched actions: fix misleading text strings in pedit action Change "tc filter pedit .." to "tc actions pedit .." in error messages to clearly refer to pedit action. Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:12:03 +09:00
Roman Mashak	6ff7586e38	net sched actions: use sizeof operator for buffer length Replace constant integer with sizeof() to clearly indicate the destination buffer length in skb_header_pointer() calls. Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:12:03 +09:00
Roman Mashak	544377cd25	net sched actions: fix sparse warning The variable _data in include/asm-generic/sections.h defines sections, this causes sparse warning in pedit: net/sched/act_pedit.c:293:35: warning: symbol '_data' shadows an earlier one ./include/asm-generic/sections.h:36:13: originally declared here Therefore rename the variable. Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:12:03 +09:00
Roman Mashak	80f0f574cc	net sched actions: fix coding style in pedit action Fix coding style issues in tc pedit action detected by the checkpatch script. Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:12:03 +09:00
Yousuk Seung	0a9fe5c375	netem: slotting with non-uniform distribution Extend slotting with support for non-uniform distributions. This is similar to netem's non-uniform distribution delay feature. Commit f043efeae2f1 ("netem: support delivering packets in delayed time slots") added the slotting feature to approximate the behaviors of media with packet aggregation but only supported a uniform distribution for delays between transmission attempts. Tests with TCP BBR with emulated wifi links with non-uniform distributions produced more useful results. Syntax: slot dist DISTRIBUTION DELAY JITTER [packets MAX_PACKETS] \ [bytes MAX_BYTES] The syntax and use of the distribution table is the same as in the non-uniform distribution delay feature. A file DISTRIBUTION must be present in TC_LIB_DIR (e.g. /usr/lib/tc) containing numbers scaled by NETEM_DIST_SCALE. A random value x is selected from the table and it takes DELAY + ( x * JITTER ) as delay. Correlation between values is not supported. Examples: Normal distribution delay with mean = 800us and stdev = 100us. > tc qdisc add dev eth0 root netem slot dist normal 800us 100us Optionally set the max slot size in bytes and/or packets. > tc qdisc add dev eth0 root netem slot dist normal 800us 100us \ bytes 64k packets 42 Signed-off-by: Yousuk Seung <ysseung@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:06:24 +09:00
Ursula Braun	24ac3a08e6	net/smc: rebuild nonblocking connect The recent poll change may lead to stalls for non-blocking connecting SMC sockets, since sock_poll_wait is no longer performed on the internal CLC socket, but on the outer SMC socket. kernel_connect() on the internal CLC socket returns with -EINPROGRESS, but the wake up logic does not work in all cases. If the internal CLC socket is still in state TCP_SYN_SENT when polled, sock_poll_wait() from sock_poll() does not sleep. It is supposed to sleep till the state of the internal CLC socket switches to TCP_ESTABLISHED. This problem triggered a redesign of the SMC nonblocking connect logic. This patch introduces a connect worker covering all connect steps followed by a wake up of socket waiters. It allows to get rid of all delays and locks in smc_poll(). Fixes: `c0129a0614` ("smc: convert to ->poll_mask") Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:03:55 +09:00
Eric Dumazet	15ecbe94a4	tcp: add one more quick ack after after ECN events Larry Brakmo proposal ( https://patchwork.ozlabs.org/patch/935233/ tcp: force cwnd at least 2 in tcp_cwnd_reduction) made us rethink about our recent patch removing ~16 quick acks after ECN events. tcp_enter_quickack_mode(sk, 1) makes sure one immediate ack is sent, but in the case the sender cwnd was lowered to 1, we do not want to have a delayed ack for the next packet we will receive. Fixes: `522040ea5f` ("tcp: do not aggressively quick ack after ECN events") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Neal Cardwell <ncardwell@google.com> Cc: Lawrence Brakmo <brakmo@fb.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 22:01:04 +09:00
Masahiro Yamada	8e75887d32	bpfilter: include bpfilter_umh in assembly instead of using objcopy What we want here is to embed a user-space program into the kernel. Instead of the complex ELF magic, let's simply wrap it in the assembly with the '.incbin' directive. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 21:39:16 +09:00
Doron Roberts-Kedes	977c7114eb	strparser: Remove early eaten to fix full tcp receive buffer stall On receving an incomplete message, the existing code stores the remaining length of the cloned skb in the early_eaten field instead of incrementing the value returned by __strp_recv. This defers invocation of sock_rfree for the current skb until the next invocation of __strp_recv, which returns early_eaten if early_eaten is non-zero. This behavior causes a stall when the current message occupies the very tail end of a massive skb, and strp_peek/need_bytes indicates that the remainder of the current message has yet to arrive on the socket. The TCP receive buffer is totally full, causing the TCP window to go to zero, so the remainder of the message will never arrive. Incrementing the value returned by __strp_recv by the amount otherwise stored in early_eaten prevents stalls of this nature. Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 21:37:26 +09:00
Guillaume Nault	a408194aa0	l2tp: define helper for parsing struct sockaddr_pppol2tp* 'sockaddr_len' is checked against various values when entering pppol2tp_connect(), to verify its validity. It is used again later, to find out which sockaddr structure was passed from user space. This patch combines these two operations into one new function in order to simplify pppol2tp_connect(). A new structure, l2tp_connect_info, is used to pass sockaddr data back to pppol2tp_connect(), to avoid passing too many parameters to l2tp_sockaddr_get_info(). Also, the first parameter is void* in order to avoid casting between all sockaddr_* structures manually. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 16:06:50 +09:00
Eric Dumazet	242b1bbe51	tcp: remove one indentation level in tcp_create_openreq_child Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 16:02:31 +09:00
Masahiro Yamada	88e85a7daf	bpfilter: check compiler capability in Kconfig With the brand-new syntax extension of Kconfig, we can directly check the compiler capability in the configuration phase. If the cc-can-link.sh fails, the BPFILTER_UMH is automatically hidden by the dependency. I also deleted 'default n', which is no-op. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 13:36:39 +09:00
David S. Miller	0901441839	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree: 1) Missing netlink attribute validation in nf_queue, uncovered by KASAN, from Eric Dumazet. 2) Use pointer to sysctl table, save us 192 bytes of memory per netns. Also from Eric. 3) Possible use-after-free when removing conntrack helper modules due to missing synchronize RCU call. From Taehee Yoo. 4) Fix corner case in systcl writes to nf_log that lead to appending data to uninitialized buffer, from Jann Horn. 5) Jann Horn says we may indefinitely block other users of nf_log_mutex if a userspace access in proc_dostring() blocked e.g. due to a userfaultfd. 6) Fix garbage collection race for unconfirmed conntrack entries, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 13:32:44 +09:00
Zhen Lei	7284fdf39a	esp6: fix memleak on error path in esp6_input This ought to be an omission in `e619492323` ("esp: Fix memleaks on error paths."). The memleak on error path in esp6_input is similar to esp_input of esp4. Fixes: `e619492323` ("esp: Fix memleaks on error paths.") Fixes: `3f29770723` ("ipsec: check return value of skb_to_sgvec always") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2018-06-27 17:32:11 +02:00
Roopa Prabhu	8e326289e3	neighbour: force neigh_invalidate when NUD_FAILED update is from admin In systems where neigh gc thresh holds are set to high values, admin deleted neigh entries (eg ip neigh flush or ip neigh del) can linger around in NUD_FAILED state for a long time until periodic gc kicks in. This patch forces neigh_invalidate when NUD_FAILED neigh_update is from an admin. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-27 15:40:45 +09:00
Kees Cook	3463e51dc3	net/tls: Remove VLA usage on nonce It looks like the prior VLA removal, commit `b16520f749` ("net/tls: Remove VLA usage"), and a new VLA addition, commit `c46234ebb4` ("tls: RX path for ktls"), passed in the night. This removes the newly added VLA, which happens to have its bounds based on the same max value. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-27 10:39:52 +09:00
Jason A. Donenfeld	7c8f4e6dc3	fib_rules: match rules based on suppress_* properties too Two rules with different values of suppress_prefix or suppress_ifgroup are not the same. This fixes an -EEXIST when running: $ ip -4 rule add table main suppress_prefixlength 0 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Fixes: `f9d4b0c1e9` ("fib_rules: move common handling of newrule delrule msgs into fib_nl2rule") Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-27 10:33:05 +09:00
Sowmini Varadhan	c809195f55	rds: clean up loopback rds_connections on netns deletion The RDS core module creates rds_connections based on callbacks from rds_loop_transport when sending/receiving packets to local addresses. These connections will need to be cleaned up when they are created from a netns that is not init_net, and that netns is deleted. Add the changes aligned with the changes from commit `ebeeb1ad9b` ("rds: tcp: use rds_destroy_pending() to synchronize netns/module teardown and rds connection/workq management") for rds_loop_transport Reported-and-tested-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-27 10:11:03 +09:00
Florian Westphal	b36e4523d4	netfilter: nf_conncount: fix garbage collection confirm race Yi-Hung Wei and Justin Pettit found a race in the garbage collection scheme used by nf_conncount. When doing list walk, we lookup the tuple in the conntrack table. If the lookup fails we remove this tuple from our list because the conntrack entry is gone. This is the common cause, but turns out it's not the only one. The list entry could have been created just before by another cpu, i.e. the conntrack entry might not yet have been inserted into the global hash. To avoid this, we introduce a timestamp and the owning cpu. If the entry appears to be stale, evict only if: 1. The current cpu is the one that added the entry, or, 2. The timestamp is older than two jiffies The second constraint allows GC to be taken over by other cpu too (e.g. because a cpu was offlined or napi got moved to another cpu). We can't pretend the 'doubtful' entry wasn't in our list. Instead, when we don't find an entry indicate via IS_ERR that entry was removed ('did not exist') or withheld ('might-be-unconfirmed'). This most likely also fixes a xt_connlimit imbalance earlier reported by Dmitry Andrianov. Cc: Dmitry Andrianov <dmitry.andrianov@alertme.com> Reported-by: Justin Pettit <jpettit@vmware.com> Reported-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-06-26 18:28:57 +02:00
Jann Horn	ce00bf07cc	netfilter: nf_log: don't hold nf_log_mutex during user access The old code would indefinitely block other users of nf_log_mutex if a userspace access in proc_dostring() blocked e.g. due to a userfaultfd region. Fix it by moving proc_dostring() out of the locked region. This is a followup to commit `266d07cb1c` ("netfilter: nf_log: fix sleeping function called from invalid context"), which changed this code from using rcu_read_lock() to taking nf_log_mutex. Fixes: `266d07cb1c` ("netfilter: nf_log: fix sleeping function calle[...]") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-06-26 16:48:40 +02:00
Jann Horn	dffd22aed2	netfilter: nf_log: fix uninit read in nf_log_proc_dostring When proc_dostring() is called with a non-zero offset in strict mode, it doesn't just write to the ->data buffer, it also reads. Make sure it doesn't read uninitialized data. Fixes: `c6ac37d8d8` ("netfilter: nf_log: fix error on write NONE to [...]") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-06-26 16:48:23 +02:00
John Hurley	326367427c	net: sched: call reoffload op on block callback reg Call the reoffload tcf_proto_op on all tcf_proto nodes in all chains of a block when a callback tries to register to a block that already has offloaded rules. If all existing rules cannot be offloaded then the registration is rejected. This replaces the previous policy of rejecting such callback registration outright. On unregistration of a callback, the rules are flushed for that given cb. The implementation of block sharing in the NFP driver, for example, duplicates shared rules to all devs bound to a block. This meant that rules could still exist in hw even after a device is unbound from a block (assuming the block still remains active). Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-26 23:21:33 +09:00
John Hurley	7e916b7680	net: sched: cls_bpf: implement offload tcf_proto_op Add the offload tcf_proto_op in cls_bpf to generate an offload message for each bpf prog in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' prog. A prog contains a flag to indicate if it is in hardware or not. To ensure the offload function properly maintains this flag, keep a reference counter for the number of instances of the prog that are in hardware. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-26 23:21:33 +09:00
John Hurley	530d995123	net: sched: cls_u32: implement offload tcf_proto_op Add the offload tcf_proto_op in cls_u32 to generate an offload message for each filter and the hashtable in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. A filter contains a flag to indicate if it is in hardware or not. To ensure the offload function properly maintains this flag, keep a reference counter for the number of instances of the filter that are in hardware. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-26 23:21:33 +09:00
John Hurley	0efd1b3a13	net: sched: cls_matchall: implement offload tcf_proto_op Add the reoffload tcf_proto_op in matchall to generate an offload message for each filter in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. Ensure matchall flags correctly report if the rule is in hw by keeping a reference counter for the number of instances of the rule offloaded. Only update the flag when this counter changes from or to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-26 23:21:33 +09:00
John Hurley	31533cba43	net: sched: cls_flower: implement offload tcf_proto_op Add the reoffload tcf_proto_op in flower to generate an offload message for each filter in the given tcf_proto. Call the specified callback with this new offload message. The function only returns an error if the callback rejects adding a 'hardware only' rule. A filter contains a flag to indicate if it is in hardware or not. To ensure the reoffload function properly maintains this flag, keep a reference counter for the number of instances of the filter that are in hardware. Only update the flag when this counter changes from or to 0. Add a generic helper function to implement this behaviour. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-26 23:21:32 +09:00
John Hurley	60513bd82c	net: sched: pass extack pointer to block binds and cb registration Pass the extack struct from a tc qdisc add to the block bind function and, in turn, to the setup_tc ndo of binding device via the tc_block_offload struct. Pass this back to any block callback registrations to allow netlink logging of fails in the bind process. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-26 23:21:33 +09:00
Guillaume Nault	2685fbb804	l2tp: make l2tp_xmit_core() return void It always returns 0, and nobody reads the return value anyway. Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-26 22:55:51 +09:00
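
Note on the reoffload commits above (31533cba43, 0efd1b3a13, 530d995123, 7e916b7680): each message describes keeping a reference counter of how many hardware instances of a filter exist and updating the "in hardware" flag only when that counter changes from or to 0. The following is a minimal user-space sketch of that rule, not the kernel's helper. The struct, the flag names, and the function name are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, named for this sketch only. */
#define SKETCH_IN_HW     (1u << 0)
#define SKETCH_NOT_IN_HW (1u << 1)

/* Hypothetical stand-in for a classifier filter. */
struct sketch_filter {
	uint32_t flags;           /* reports whether the rule is in hardware */
	unsigned int in_hw_count; /* number of hardware instances of the rule */
};

/*
 * Adjust the instance counter after a callback added (add == true) or
 * removed (add == false) the rule, and flip the flag only on the
 * 0 -> 1 and 1 -> 0 transitions of the counter.
 */
static void sketch_offload_cnt_update(struct sketch_filter *f, bool add)
{
	if (add) {
		if (f->in_hw_count == 0) {
			f->flags &= ~SKETCH_NOT_IN_HW;
			f->flags |= SKETCH_IN_HW;
		}
		f->in_hw_count++;
	} else {
		/* Removing an instance that was never added is a caller bug. */
		assert(f->in_hw_count > 0);
		f->in_hw_count--;
		if (f->in_hw_count == 0) {
			f->flags &= ~SKETCH_IN_HW;
			f->flags |= SKETCH_NOT_IN_HW;
		}
	}
}
```

With this shape, a rule duplicated to several devices bound to a shared block keeps reporting "in hardware" until the last instance is removed, which appears to be the behaviour the commit messages aim for.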

52701 Commits