NEXTHDR_MAX is 255. What happens here is that we take a u8 value
"hdr->nexthdr" from the network and then look it up in
lowpan_nexthdr_nhcs[]. The problem is that if hdr->nexthdr is 0xff then
we read one element beyond the end of the array so the array needs to
be one element larger.
Fixes: 92aa7c65d2 ("6lowpan: add generic nhc layer interface")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Acked-by: Alexander Aring <aring@mojatatu.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Struct ca is copied from userspace. It is not checked whether the "name"
field is NULL terminated, which allows local users to obtain potentially
sensitive information from kernel stack memory, via a HIDPCONNADD command.
This vulnerability is similar to CVE-2011-1079.
Signed-off-by: Young Xiao <YangX92@hotmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org
Now that we use skb-less flow dissector let's return true nhoff and
thoff. We used to adjust them by ETH_HLEN because that's how it was
done in the skb case. For VLAN tests that looks confusing: nhoff is
pointing to vlan parts :-\
Warning, this is an API change for BPF_PROG_TEST_RUN! Feel free to drop
if you think that it's too late at this point to fix it.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
When called without skb, gather all required data from the
__skb_flow_dissect's arguments and use recently introduces
no-skb mode of bpf flow dissector.
Note: WARN_ON_ONCE(!net) will now trigger for eth_get_headlen users.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This new argument will be used in the next patches for the
eth_get_headlen use case. eth_get_headlen calls flow dissector
with only data (without skb) so there is currently no way to
pull attached BPF flow dissector program. With this new argument,
we can amend the callers to explicitly pass network namespace
so we can use attached BPF program.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Now that we have bpf_flow_dissect which can work on raw data,
use it when doing BPF_PROG_TEST_RUN for flow dissector.
Simplifies bpf_prog_test_run_flow_dissector and allows us to
test no-skb mode.
Note, that previously, with bpf_flow_dissect_skb we used to call
eth_type_trans which pulled L2 (ETH_HLEN) header and we explicitly called
skb_reset_network_header. That means flow_keys->nhoff would be
initialized to 0 (skb_network_offset) in init_flow_keys.
Now we call bpf_flow_dissect with nhoff set to ETH_HLEN and need
to undo it once the dissection is done to preserve the existing behavior.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
struct bpf_flow_dissector has a small subset of sk_buff fields that
flow dissector BPF program is allowed to access and an optional
pointer to real skb. Real skb is used only in bpf_skb_load_bytes
helper to read non-linear data.
The real motivation for this is to be able to call flow dissector
from eth_get_headlen context where we don't have an skb and need
to dissect raw bytes.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add return check for security level set for socket interface since
stack will check the return value.
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
l2cap_le_flowctl_init was reseting the tx_credits which works only for
outgoing connection since that set the tx_credits on the response, for
incoming connections that was not the case which leaves the channel
without any credits causing it to be suspended.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org # 4.20+
We need to dereference the directory to get its parent to
be able to rename it, so it's clearly not safe to try to
do this with ERR_PTR() pointers. Skip in this case.
It seems that this is most likely what was causing the
report by syzbot, but I'm not entirely sure as it didn't
come with a reproducer this time.
Cc: stable@vger.kernel.org
Reported-by: syzbot+4ece1a28b8f4730547c9@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Commit c82c06ce43d3("cfg80211: Notify all User Hints To self managed wiphys")
notified all new user hints to self managed wiphy's after device registration.
But it didn't do this for anything other than cell base hints done before
registration.
This needs to be done during wiphy registration of a self managed device also,
so that the previous user settings are retained.
Fixes: c82c06ce43 ("cfg80211: Notify all User Hints To self managed wiphys")
Signed-off-by: Sriram R <srirrama@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The txq of vif is added to active_txqs list for ATF TXQ scheduling
in the function ieee80211_queue_skb(), but it was not properly removed
before freeing the txq object. It was causing use after free of the txq
objects from the active_txqs list, result was kernel panic
due to invalid memory access.
Fix kernel invalid memory access by properly removing txq object
from active_txqs list before free the object.
Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
None of them have any external callers, make them static.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
No external dependencies, might as well handle this directly.
xfrm_afinfo_policy is now 40 bytes on x86_64.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Only used by ipv4, we can read the fl4 tos value directly instead.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The Kconfig currently controlling compilation of this code is:
net/strparser/Kconfig:config STREAM_PARSER
net/strparser/Kconfig: def_bool n
...meaning that it currently is not being built as a module by anyone.
Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.
Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit. For
clarity, we change the fcn name mod_init to dev_init at the same time.
We replace module.h with init.h and export.h ; the latter since this
file exports some syms.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Kconfig controlling this code is:
bpfilter/Kconfig:menuconfig BPFILTER
bpfilter/Kconfig: bool "BPF based packet filtering framework (BPFILTER)"
Since it isn't a module, we shouldn't use module_init(). Instead we
use device_initcall() - which is exactly what module_init() defaults
to for non-modular code/builds.
We don't remove <linux/module.h> from the includes since this file does
a request_module() and hence is a valid user of that header file, even
though it is not modular itself.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Kconfig currently controlling compilation of this code is:
net/Kconfig:config CGROUP_NET_PRIO
net/Kconfig: bool "Network priority cgroup"
...meaning that it currently is not being built as a module by anyone,
as module support was discontinued in 2014.
We delete the MODULE_LICENSE tag since all that information is already
contained at the top of the file in the comments.
We don't delete module.h from the includes since it was no longer there
to begin with.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Rosen, Rami" <rami.rosen@intel.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: netdev@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The header contains rtnh_ macros so rename the file accordingly.
Allows a later patch to use the nexthop.h name for the new
nexthop code.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf-next 2019-04-22
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) allow stack/queue helpers from more bpf program types, from Alban.
2) allow parallel verification of root bpf programs, from Alexei.
3) introduce bpf sysctl hook for trusted root cases, from Andrey.
4) recognize var/datasec in btf deduplication, from Andrii.
5) cpumap performance optimizations, from Jesper.
6) verifier prep for alu32 optimization, from Jiong.
7) libbpf xsk cleanup, from Magnus.
8) other various fixes and cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree:
1) Add a selftest for icmp packet too big errors with conntrack, from
Florian Westphal.
2) Validate inner header in ICMP error message does not lie to us
in conntrack, also from Florian.
3) Initialize ct->timeout to calm down KASAN, from Alexander Potapenko.
4) Skip ICMP error messages from tunnels in IPVS, from Julian Anastasov.
5) Use a hash to expose conntrack and expectation ID, from Florian Westphal.
6) Prevent shift wrap in nft_chain_parse_hook(), from Dan Carpenter.
7) Fix broken ICMP ID randomization with NAT, also from Florian.
8) Remove WARN_ON in ebtables compat that is reached via syzkaller,
from Florian Westphal.
9) Fix broken timestamps since fb420d5d91 ("tcp/fq: move back to
CLOCK_MONOTONIC"), from Florian.
10) Fix logging of invalid packets in conntrack, from Andrei Vagin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It doesn't log a packet if sysctl_log_invalid isn't equal to protonum
OR sysctl_log_invalid isn't equal to IPPROTO_RAW. This sentence is
always true. I believe we need to replace OR to AND.
Cc: Florian Westphal <fw@strlen.de>
Fixes: c4f3db1595 ("netfilter: conntrack: add and use nf_l4proto_log_invalid")
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
setting net.netfilter.nf_conntrack_timestamp=1 breaks xmit with fq
scheduler. skb->tstamp might be "refreshed" using ktime_get_real(),
but fq expects CLOCK_MONOTONIC.
This patch removes all places in netfilter that check/set skb->tstamp:
1. To fix the bogus "start" time seen with conntrack timestamping for
outgoing packets, never use skb->tstamp and always use current time.
2. In nfqueue and nflog, only use skb->tstamp for incoming packets,
as determined by current hook (prerouting, input, forward).
3. xt_time has to use system clock as well rather than skb->tstamp.
We could still use skb->tstamp for prerouting/input/foward, but
I see no advantage to make this conditional.
Fixes: fb420d5d91 ("tcp/fq: move back to CLOCK_MONOTONIC")
Cc: Eric Dumazet <edumazet@google.com>
Reported-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pointers should be printed with %p or %px rather than
cast to long type and printed with %8.8lx.
Change %8.8lx to %p to print the pointer.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When device refuses the offload in tls_set_device_offload_rx()
it calls tls_sw_free_resources_rx() to clean up software context
state.
Unfortunately, tls_sw_free_resources_rx() does not free all
the state tls_set_sw_offload() allocated - it leaks IV and
sequence number buffers. All other code paths which lead to
tls_sw_release_resources_rx() (which tls_sw_free_resources_rx()
calls) free those right before the call.
Avoid the leak by moving freeing of iv and rec_seq into
tls_sw_release_resources_rx().
Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If device supports offload, but offload fails tls_set_device_offload_rx()
will call tls_sw_free_resources_rx() which (unhelpfully) releases
and reacquires the socket lock.
For a small fix release and reacquire the device_offload_lock.
Fixes: 4799ac81e5 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull NFS client bugfix from Trond Myklebust:
"Fix a regression in which an RPC call can be tagged with an error
despite the transmission being successful"
* tag 'nfs-for-5.1-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Ignore queue transmission errors on successful transmission
tcp sendmsg() and sendpage() normally advance skb->data_len
and skb->truesize by the payload added to an skb.
But sendmsg(fd, ..., MSG_ZEROCOPY) has to account for whole pages,
even if a single byte of payload is used in the page.
This means that we can not assume skb->truesize can be adjusted
by skb->data_len. We must instead overwrite its value.
Otherwise skb->truesize is too big and can hit socket sndbuf limit,
especially if the skb is recycled multiple times :/
Fixes: 472c2e07ee ("tcp: add one skb cache for tx")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the
number of buffers in receive socket buffer which is not so helpful
for user space applications.
This commit introduces the new option TIPC_SOCK_RECVQ_USED which
returns the current allocated bytes of the receive socket buffer.
This helps user space applications dimension its buffer usage to
avoid buffer overload issue.
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'timeval' and 'timespec' data structures used for socket timestamps
are going to be redefined in user space based on 64-bit time_t in future
versions of the C library to deal with the y2038 overflow problem,
which breaks the ABI definition.
Unlike many modern ioctl commands, SIOCGSTAMP and SIOCGSTAMPNS do not
use the _IOR() macro to encode the size of the transferred data, so it
remains ambiguous whether the application uses the old or new layout.
The best workaround I could find is rather ugly: we redefine the command
code based on the size of the respective data structure with a ternary
operator. This lets it get evaluated as late as possible, hopefully after
that structure is visible to the caller. We cannot use an #ifdef here,
because inux/sockios.h might have been included before any libc header
that could determine the size of time_t.
The ioctl implementation now interprets the new command codes as always
referring to the 64-bit structure on all architectures, while the old
architecture specific command code still refers to the old architecture
specific layout. The new command number is only used when they are
actually different.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
socket protocol handlers, and all of those end up calling the same
sock_get_timestamp()/sock_get_timestampns() helper functions, which
results in a lot of duplicate code.
With the introduction of 64-bit time_t on 32-bit architectures, this
gets worse, as we then need four different ioctl commands in each
socket protocol implementation.
To simplify that, let's add a new .gettstamp() operation in
struct proto_ops, and move ioctl implementation into the common
sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
through.
We can reuse the sock_get_timestamp() implementation, but generalize
it so it can deal with both native and compat mode, as well as
timeval and timespec structures.
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If vlan bridge binding is enabled, then the link state of a vlan device
that is an upper device of the bridge tracks the state of bridge ports
that are members of that vlan. But this can only be done when the link
state of the bridge is up. If it is down, then the link state of the
vlan devices must also be down. This is to maintain existing behavior
for when STP is enabled and there are no live ports, in which case the
link state for the bridge and any vlan devices is down.
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If vlan bridge binding is enabled, then the link state of a vlan device
that is an upper device of the bridge should track the state of bridge
ports that are members of that vlan. So if a bridge port becomes or
stops being a member of a vlan, then update the link state of the
vlan device if necessary.
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the case of vlan filtering on bridges, the bridge may also have the
corresponding vlan devices as upper devices. A vlan bridge binding mode
is added to allow the link state of the vlan device to track only the
state of the subset of bridge ports that are also members of the vlan,
rather than that of all bridge ports. This mode is set with a vlan flag
rather than a bridge sysfs so that the 8021q module is aware that it
should not set the link state for the vlan device.
If bridge vlan is configured, the bridge device event handling results
in the link state for an upper device being set, if it is a vlan device
with the vlan bridge binding mode enabled. This also sets a
vlan_bridge_binding flag so that subsequent UP/DOWN/CHANGE events for
the ports in that bridge result in a link state update of the vlan
device if required.
The link state of the vlan device is up if there is at least one bridge
port that is a vlan member that is admin & oper up, otherwise its oper
state is IF_OPER_LOWERLAYERDOWN.
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In vlan bridge binding mode, the link state is no longer transferred
from the lower device. Instead it is set by the bridge module according
to the state of bridge ports that are members of the vlan.
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the case of vlan filtering on bridges, the bridge may also have the
corresponding vlan devices as upper devices. Currently the link state
of vlan devices is transferred from the lower device. So this is up if
the bridge is in admin up state and there is at least one bridge port
that is up, regardless of the vlan that the port is a member of.
The link state of the vlan device may need to track only the state of
the subset of ports that are also members of the corresponding vlan,
rather than that of all ports.
Add a flag to specify a vlan bridge binding mode, by which the link
state is no longer automatically transferred from the lower device,
but is instead determined by the bridge ports that are members of the
vlan.
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent changes to taprio did not use the correct div64 helpers,
leading to:
net/sched/sch_taprio.o: In function `taprio_dequeue':
sch_taprio.c:(.text+0x34a): undefined reference to `__divdi3'
net/sched/sch_taprio.o: In function `advance_sched':
sch_taprio.c:(.text+0xa0b): undefined reference to `__divdi3'
net/sched/sch_taprio.o: In function `taprio_init':
sch_taprio.c:(.text+0x1450): undefined reference to `__divdi3'
/home/jkicinski/devel/linux/Makefile:1032: recipe for target 'vmlinux' failed
Use math64 helpers.
Fixes: 7b9eba7ba0 ("net/sched: taprio: fix picos_per_byte miscalculation")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GCC complains:
net/l2tp/l2tp_ppp.c: In function ‘pppol2tp_ioctl’:
net/l2tp/l2tp_ppp.c:1073:6: warning: variable ‘val’ set but not used [-Wunused-but-set-variable]
int val;
^~~
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To make ICMPv6 closer to ICMPv4, add ratemask parameter. Since the ICMP
message types use larger numeric values, a simple bitmask doesn't fit.
I use large bitmap. The input and output are the in form of list of
ranges. Set the default to rate limit all error messages but Packet Too
Big. For Packet Too Big, use ratemask instead of hard-coded.
There are functions where icmpv6_xrlim_allow() and icmpv6_global_allow()
aren't called. This patch only adds them to icmpv6_echo_reply().
Rate limiting error messages is mandated by RFC 4443 but RFC 4890 says
that it is also acceptable to rate limit informational messages. Thus,
I removed the current hard-coded behavior of icmpv6_mask_allow() that
doesn't rate limit informational messages.
v2: Add dummy function proc_do_large_bitmap() if CONFIG_PROC_SYSCTL
isn't defined, expand the description in ip-sysctl.txt and remove
unnecessary conditional before kfree().
v3: Inline the bitmap instead of dynamically allocated. Still is a
pointer to it is needed because of the way proc_do_large_bitmap work.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike atomic_add(), refcount_add() does not deal well
with a negative argument. TLS fallback code reallocates
the skb and is very likely to shrink the truesize, leading to:
[ 189.513254] WARNING: CPU: 5 PID: 0 at lib/refcount.c:81 refcount_add_not_zero_checked+0x15c/0x180
Call Trace:
refcount_add_checked+0x6/0x40
tls_enc_skb+0xb93/0x13e0 [tls]
Once wmem_allocated count saturates the application can no longer
send data on the socket. This is similar to Eric's fixes for GSO,
TCP:
commit 7ec318feee ("tcp: gso: avoid refcount_t warning from tcp_gso_segment()")
and UDP:
commit 575b65bc5b ("udp: avoid refcount_t saturation in __udp_gso_segment()").
Unlike the GSO case, for TLS fallback it's likely that the skb has
shrunk, so the "likely" annotation is the other way around (likely
branch being "sub").
Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull RCU and LKMM commits from Paul E. McKenney:
- An LKMM commit adding support for synchronize_srcu_expedited()
- A couple of straggling RCU flavor consolidation updates
- Documentation updates.
- Miscellaneous fixes
- SRCU updates
- RCU CPU stall-warning updates
- Torture-test updates
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Disabling IPv6 on an interface removes existing entries but nothing prevents
new entries from being manually added. To that end, add a new neigh_table
operation, allow_add, that is called on RTM_NEWNEIGH to see if neighbor
entries are allowed on a given device. If IPv6 is disabled on the device,
allow_add returns false and passes a message back to the user via extack.
$ echo 1 > /proc/sys/net/ipv6/conf/eth1/disable_ipv6
$ ip -6 neigh add fe80::4c88:bff:fe21:2704 dev eth1 lladdr de:ad:be:ef:01:01
Error: IPv6 is disabled on this device.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the fib6_flags and fib6_type to fib6_result. Update the lookup helpers
to set them and update post fib lookup users to use the version from the
result.
This allows nexthop objects to have blackhole nexthop.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change fib6_lookup and fib6_table_lookup to take a fib6_result and set
f6i and nh rather than returning a fib6_info. For now both always
return 0.
A later patch set can make these more like the IPv4 counterparts and
return EINVAL, EACCESS, etc based on fib6_type.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>