Commit Graph

44719 Commits

Author SHA1 Message Date
Sowmini Varadhan
11bb62f7c0 RDS: Do not send a pong to an incoming ping with 0 src port
RDS ping messages are sent with a non-zero src port to a zero
dst port, so that the rds pong messages can be sent back to the
originators src port. However if a confused/malicious sender
sends a ping with a 0 src port, we'd have an infinite ping-pong
loop. To avoid this, the receiver should ignore ping messages
with a 0 src port.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:18 -04:00
Sowmini Varadhan
8315011ad6 RDS: TCP: Simplify reconnect to avoid duelling reconnnect attempts
When reconnecting, the peer with the smaller IP address will initiate
the reconnect, to avoid needless duelling SYN issues.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:17 -04:00
Sowmini Varadhan
b04e8554f7 RDS: TCP: Hooks to set up a single connection path
This patch adds ->conn_path_connect callbacks in the rds_transport
that are used to set up a single connection path.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:17 -04:00
Sowmini Varadhan
2da43c4a1b RDS: TCP: make receive path use the rds_conn_path
The ->sk_user_data contains a pointer to the rds_conn_path
for the socket. Use this consistently in the rds_tcp_data_ready
callbacks to get the rds_conn_path for rds_recv_incoming.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:17 -04:00
Sowmini Varadhan
ea3b1ea539 RDS: TCP: make ->sk_user_data point to a rds_conn_path
The socket callbacks should all operate on a struct rds_conn_path,
in preparation for a MP capable RDS-TCP.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:17 -04:00
Sowmini Varadhan
afb4164d91 RDS: TCP: Refactor connection destruction to handle multiple paths
A single rds_connection may have multiple rds_conn_paths that have
to be carefully and correctly destroyed, for both rmmod and
netns-delete cases.

For both cases, we extract a single rds_tcp_connection for
each conn into a temporary list, and then invoke rds_conn_destroy()
which iteratively dismantles every path in the rds_connection.

For the netns deletion case, we additionally have to make sure
that we do not leave a socket in TIME_WAIT state, as this will
hold up the netns deletion. Thus we call rds_tcp_conn_paths_destroy()
to reset state quickly.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:17 -04:00
Sowmini Varadhan
02105b2ccd RDS: TCP: Make rds_tcp_connection track the rds_conn_path
The struct rds_tcp_connection is the transport-specific private
data structure that tracks TCP information per rds_conn_path.
Modify this structure to have a back-pointer to the rds_conn_path
for which it is the ->cp_transport_data.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:17 -04:00
Sowmini Varadhan
26e4e6bb68 RDS: TCP: Remove dead logic around c_passive in rds-tcp
The c_passive bit is only intended for the IB transport and will
never be encountered in rds-tcp, so remove the dead logic that
predicates on this bit.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:17 -04:00
Sowmini Varadhan
226f7a7d97 RDS: Rework path specific indirections
Refactor code to avoid separate indirections for single-path
and multipath transports. All transports (both single and mp-capable)
will get a pointer to the rds_conn_path, and can trivially derive
the rds_connection from the ->cp_conn.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:45:17 -04:00
Martin KaFai Lau
4a482f34af cgroup: bpf: Add bpf_skb_in_cgroup_proto
Adds a bpf helper, bpf_skb_in_cgroup, to decide if a skb->sk
belongs to a descendant of a cgroup2.  It is similar to the
feature added in netfilter:
commit c38c4597e4 ("netfilter: implement xt_cgroup cgroup2 path match")

The user is expected to populate a BPF_MAP_TYPE_CGROUP_ARRAY
which will be used by the bpf_skb_in_cgroup.

Modifications to the bpf verifier is to ensure BPF_MAP_TYPE_CGROUP_ARRAY
and bpf_skb_in_cgroup() are always used together.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:32:13 -04:00
WANG Cong
82a31b9231 net_sched: fix mirrored packets checksum
Similar to commit 9b368814b3 ("net: fix bridge multicast packet checksum validation")
we need to fixup the checksum for CHECKSUM_COMPLETE when
pushing skb on RX path. Otherwise we get similar splats.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:19:34 -04:00
David S. Miller
eb70db8756 packet: Use symmetric hash for PACKET_FANOUT_HASH.
People who use PACKET_FANOUT_HASH want a symmetric hash, meaning that
they want packets going in both directions on a flow to hash to the
same bucket.

The core kernel SKB hash became non-symmetric when the ipv6 flow label
and other entities were incorporated into the standard flow hash order
to increase entropy.

But there are no users of PACKET_FANOUT_HASH who want an assymetric
hash, they all want a symmetric one.

Therefore, use the flow dissector to compute a flat symmetric hash
over only the protocol, addresses and ports.  This hash does not get
installed into and override the normal skb hash, so this change has
no effect whatsoever on the rest of the stack.

Reported-by: Eric Leblond <eric@regit.org>
Tested-by: Eric Leblond <eric@regit.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:07:50 -04:00
Daniel Borkmann
113214be7f bpf: refactor bpf_prog_get and type check into helper
Since bpf_prog_get() and program type check is used in a couple of places,
refactor this into a small helper function that we can make use of. Since
the non RO prog->aux part is not used in performance critical paths and a
program destruction via RCU is rather very unlikley when doing the put, we
shouldn't have an issue just doing the bpf_prog_get() + prog->type != type
check, but actually not taking the ref at all (due to being in fdget() /
fdput() section of the bpf fd) is even cleaner and makes the diff smaller
as well, so just go for that. Callsites are changed to make use of the new
helper where possible.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 16:00:47 -04:00
Moritz Sichert
f1504307b9 netfilter: Remove references to obsolete CONFIG_IP_ROUTE_FWMARK
This option was removed in commit 47dcf0cb10 ("[NET]: Rethink mark field
in struct flowi").

Signed-off-by: Moritz Sichert <moritz+linux@sichert.me>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01 16:37:07 +02:00
Joe Perches
4ae89ad924 etherdevice.h & bridge: netfilter: Add and use ether_addr_equal_masked
There are code duplications of a masked ethernet address comparison here
so make it a separate function instead.

Miscellanea:

o Neaten alignment of FWINV macro uses to make it clearer for the reader

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01 16:37:06 +02:00
Pablo Neira Ayuso
468b021b94 netfilter: x_tables: simplify ip{6}table_mangle_hook()
No need for a special case to handle NF_INET_POST_ROUTING, this is
basically the same handling as for prerouting, input, forward.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01 16:37:02 +02:00
Florian Westphal
9cc1c73ad6 netfilter: conntrack: avoid integer overflow when resizing
Can overflow so we might allocate very small table when bucket count is
high on a 32bit platform.

Note: resize is only possible from init_netns.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-01 16:02:33 +02:00
Jason Wang
08294a26e1 net: introduce NETDEV_CHANGE_TX_QUEUE_LEN
This patch introduces a new event - NETDEV_CHANGE_TX_QUEUE_LEN, this
will be triggered when tx_queue_len. It could be used by net device
who want to do some processing at that time. An example is tun who may
want to resize tx array when tx_queue_len is changed.

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:32:17 -04:00
Michal Soltys
33ef84a77d net/sched/sch_hfsc.c: anchor virtual curve at proper vt in hfsc_change_fsc()
cl->cl_vt alone is relative only to the current backlog period, while
the curve operates on cumulative virtual time. This patch adds missing
cl->cl_vtoff.

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:03:43 -04:00
Michal Soltys
ab12cb4742 net/sched/sch_hfsc.c: go passive after vt update
When a class is going passive, it should update its cl_vt first
to be consistent with the last dequeue operation.

Otherwise its cl_vt will be one packet behind and parent's cvtmax might
not be updated as well.

One possible side effect is if some class goes passive and subsequently
goes active /without/ its parent going passive - with cl_vt lagging one
packet behind - comparison made in init_vf() will be affected (same
period).

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:03:43 -04:00
Michal Soltys
2354f056f6 net/sched/sch_hfsc.c: remove leftover dlist and droplist
This is update to:
commit a09ceb0e08 ("sched: remove qdisc->drop")

That commit removed qdisc->drop, but left alone dlist and droplist
that no longer serve any meaningful purpose.

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:03:43 -04:00
Michal Soltys
d1d0fc5e4c net/sched/sch_hfsc.c: add unlikely() in qdisc_peek_len()
The condition can only succeed on wrong configurations.

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:03:43 -04:00
Michal Soltys
12d0ad3be9 net/sched/sch_hfsc.c: handle corner cases where head may change invalidating calculated deadline
Realtime scheduling implemented in HFSC uses head of the queue to make
the decision about which packet to schedule next. But in case of any
head drop, the deadline calculated for the previous head is not
necessarily correct for the next head (unless both packets have the same
length).

Thanks to peek() function used during dequeue - which internally is a
dequeue operation - hfsc is almost safe from this issue, as peek()
dequeues and isolates the head storing it temporarily until the real
dequeue happens.

But there is one exception: if after the class activation a drop happens
before the first dequeue operation, there's never a chance to do the
peek().

Adding peek() call in enqueue - if this is the first packet in a new
backlog period AND the scheduler has realtime curve defined - fixes that
one corner case. The 1st hfsc_dequeue() will use that peeked packet,
similarly as every subsequent hfsc_dequeue() call uses packet peeked by
the previous call.

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:03:43 -04:00
Eric Dumazet
19689e38ec tcp: md5: use kmalloc() backed scratch areas
Some arches have virtually mapped kernel stacks, or will soon have.

tcp_md5_hash_header() uses an automatic variable to copy tcp header
before mangling th->check and calling crypto function, which might
be problematic on such arches.

David says that using percpu storage is also problematic on non SMP
builds.

Just use kmalloc() to allocate scratch areas.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 04:02:55 -04:00
David Howells
ac5d26836c rxrpc: Fix processing of authenticated/encrypted jumbo packets
When a jumbo packet is being split up and processed, the crypto checksum
for each split-out packet is in the jumbo header and needs placing in the
reconstructed packet header.

When the code was changed to keep the stored copy of the packet header in
host byte order, this reconstruction was missed.

Found with sparse with CF=-D__CHECK_ENDIAN__:

    ../net/rxrpc/input.c:479:33: warning: incorrect type in assignment (different base types)
    ../net/rxrpc/input.c:479:33:    expected unsigned short [unsigned] [usertype] _rsvd
    ../net/rxrpc/input.c:479:33:    got restricted __be16 [addressable] [usertype] _rsvd

Fixes: 0d12f8a402 ("rxrpc: Keep the skb private record of the Rx header in host byte order")
Signed-off-by: David Howells <dhowells@redhat.com>
2016-07-01 08:35:02 +01:00
Shmulik Ladkani
fedbb6b4ff ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output
ip_skb_dst_mtu uses skb->sk, assuming it is an AF_INET socket (e.g. it
calls ip_sk_use_pmtu which casts sk as an inet_sk).

However, in the case of UDP tunneling, the skb->sk is not necessarily an
inet socket (could be AF_PACKET socket, or AF_UNSPEC if arriving from
tun/tap).

OTOH, the sk passed as an argument throughout IP stack's output path is
the one which is of PMTU interest:
 - In case of local sockets, sk is same as skb->sk;
 - In case of a udp tunnel, sk is the tunneling socket.

Fix, by passing ip_finish_output's sk to ip_skb_dst_mtu.
This augments 7026b1ddb6 'netfilter: Pass socket pointer down through okfn().'

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 09:02:48 -04:00
Mateusz Bajorski
153380ec4b fib_rules: Added NLM_F_EXCL support to fib_nl_newrule
When adding rule with NLM_F_EXCL flag then check if the same rule exist.
If yes then exit with -EEXIST.

This is already implemented in iproute2:
        if (cmd == RTM_NEWRULE) {
                req.n.nlmsg_flags |= NLM_F_CREATE|NLM_F_EXCL;
                req.r.rtm_type = RTN_UNICAST;
        }

Tested ipv4 and ipv6 with net-next linux on qemu x86

expected behavior after patch:
localhost ~ # ip rule
0:    from all lookup local
32766:    from all lookup main
32767:    from all lookup default
localhost ~ # ip rule add from 10.46.177.97 lookup 104 pref 1005
localhost ~ # ip rule add from 10.46.177.97 lookup 104 pref 1005
RTNETLINK answers: File exists
localhost ~ # ip rule
0:    from all lookup local
1005:    from 10.46.177.97 lookup 104
32766:    from all lookup main
32767:    from all lookup default

There was already topic regarding this but I don't see any changes
merged and problem still occurs.
https://lkml.kernel.org/r/1135778809.5944.7.camel+%28%29+localhost+%21+localdomain

Signed-off-by: Mateusz Bajorski <mateusz.bajorski@nokia.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:23:19 -04:00
Andrey Vagin
b1ed4c4fa9 tcp: add an ability to dump and restore window parameters
We found that sometimes a restored tcp socket doesn't work.

A reason of this bug is incorrect window parameters and in this case
tcp_acceptable_seq() returns tcp_wnd_end(tp) instead of tp->snd_nxt. The
other side drops packets with this seq, because seq is less than
tp->rcv_nxt ( tcp_sequence() ).

Data from a send queue is sent only if there is enough space in a
window, so when we restore unacked data, we need to expand a window to
fit this data.

This was in a first version of this patch:
"tcp: extend window to fit all restored unacked data in a send queue"

Then Alexey recommended me to restore window parameters instead of
adjusted them according with data in a sent queue. This sounds resonable.

rcv_wnd has to be restored, because it was reported to another side
and the offered window is never shrunk.
One of reasons why we need to restore snd_wnd was described above.

Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 08:15:31 -04:00
Nikolay Aleksandrov
1080ab95e3 net: bridge: add support for IGMP/MLD stats and export them via netlink
This patch adds stats support for the currently used IGMP/MLD types by the
bridge. The stats are per-port (plus one stat per-bridge) and per-direction
(RX/TX). The stats are exported via netlink via the new linkxstats API
(RTM_GETSTATS). In order to minimize the performance impact, a new option
is used to enable/disable the stats - multicast_stats_enabled, similar to
the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
lookups and checks, we make use of the current "igmp" member of the bridge
private skb->cb region to record the type on Rx (both host-generated and
external packets pass by multicast_rcv()). We can do that since the igmp
member was used as a boolean and all the valid IGMP/MLD types are positive
values. The normal bridge fast-path is not affected at all, the only
affected paths are the flooding ones and since we make use of the IGMP/MLD
type, we can quickly determine if the packet should be counted using
cache-hot data (cb's igmp member). We add counters for:
* IGMP Queries
* IGMP Leaves
* IGMP v1/v2/v3 reports

* MLD Queries
* MLD Leaves
* MLD v1/v2 reports

These are invaluable when monitoring or debugging complex multicast setups
with bridges.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 06:18:24 -04:00
Nikolay Aleksandrov
80e73cc563 net: rtnetlink: add support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute
This patch adds support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute
which allows to export per-slave statistics if the master device supports
the linkxstats callback. The attribute is passed down to the linkxstats
callback and it is up to the callback user to use it (an example has been
added to the only current user - the bridge). This allows us to query only
specific slaves of master devices like bridge ports and export only what
we're interested in instead of having to dump all ports and searching only
for a single one. This will be used to export per-port IGMP/MLD stats and
also per-port vlan stats in the future, possibly other statistics as well.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 06:15:04 -04:00
Michal Kazior
59a7c828d7 mac80211: fix fq lockdep warnings
Some lockdep assertions were not fulfilled and
resulted in a kernel warning/call trace if driver
used intermediate software queues (e.g. ath10k).

Existing code sequences should've guaranteed safety
but it's always good to be extra careful.

The call trace could look like this:

 [ 237.335805] ------------[ cut here ]------------
 [ 237.335852] WARNING: CPU: 3 PID: 1921 at include/net/fq_impl.h:22 fq_flow_dequeue+0xed/0x140 [mac80211]
 [ 237.335855] Modules linked in: ath10k_pci(E-) ath10k_core(E) ath(E) mac80211(E) cfg80211(E)
 [ 237.335913] CPU: 3 PID: 1921 Comm: rmmod Tainted: G        W   E   4.7.0-rc4-wt-ath+ #1377
 [ 237.335916] Hardware name: Hewlett-Packard HP ProBook 6540b/1722, BIOS 68CDD Ver. F.04 01/27/2010
 [ 237.335918]  00200286 00200286 eff85dac c14151e2 f901574e 00000000 eff85de0 c1081075
 [ 237.335928]  c1ab91f0 00000003 00000781 f901574e 00000016 f8fbabad f8fbabad 00000016
 [ 237.335938]  eb24ff60 00000000 ef3886c0 eff85df4 c10810ba 00000009 00000000 00000000
 [ 237.335948] Call Trace:
 [ 237.335953]  [<c14151e2>] dump_stack+0x76/0xb4
 [ 237.335957]  [<c1081075>] __warn+0xe5/0x100
 [ 237.336002]  [<f8fbabad>] ? fq_flow_dequeue+0xed/0x140 [mac80211]
 [ 237.336046]  [<f8fbabad>] ? fq_flow_dequeue+0xed/0x140 [mac80211]
 [ 237.336053]  [<c10810ba>] warn_slowpath_null+0x2a/0x30
 [ 237.336095]  [<f8fbabad>] fq_flow_dequeue+0xed/0x140 [mac80211]
 [ 237.336137]  [<f8fbc67a>] fq_flow_reset.constprop.56+0x2a/0x90 [mac80211]
 [ 237.336180]  [<f8fbc79a>] fq_reset.constprop.59+0x2a/0x50 [mac80211]
 [ 237.336222]  [<f8fc04e8>] ieee80211_txq_teardown_flows+0x38/0x40 [mac80211]
 [ 237.336258]  [<f8f7c1a4>] ieee80211_unregister_hw+0xe4/0x120 [mac80211]
 [ 237.336275]  [<f933f536>] ath10k_mac_unregister+0x16/0x50 [ath10k_core]
 [ 237.336292]  [<f934592d>] ath10k_core_unregister+0x3d/0x90 [ath10k_core]
 [ 237.336301]  [<f85f8836>] ath10k_pci_remove+0x36/0xa0 [ath10k_pci]
 [ 237.336307]  [<c1470388>] pci_device_remove+0x38/0xb0
 ...

Fixes: 5caa328e38 ("mac80211: implement codel on fair queuing flows")
Fixes: fa962b9212 ("mac80211: implement fair queueing per txq")
Tested-by: Kalle Valo <kvalo@qca.qualcomm.com>
Reported-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:07:44 +02:00
Bob Copeland
efc401f49a mac80211: use common cleanup for user/!user_mpm
We've accumulated a couple of different fixes now to mesh_sta_cleanup()
due to the different paths that user_mpm and !user_mpm cases take -- one
fix to flush nexthop paths and one to fix the counting.

The only caller of mesh_plink_deactivate() is mesh_sta_cleanup(), so we
can push the user_mpm checks down into there in order to share more
code.

In doing so, we can remove an extra call to mesh_path_flush_by_nexthop()
and the (unnecessary) call to mesh_accept_plinks_update().  This will
also ensure the powersaving state code gets called in the user_mpm case.

The only cleanup tasks we need to avoid when MPM is in user-space
are sending the peering frames and stopping the plink timer, so wrap
those in the appropriate check.

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:41 +02:00
Masashi Honma
46f6b06050 mac80211: Encrypt "Group addressed privacy" action frames
Previously, the action frames to group address was not encrypted. But
[1] "Table 8-38 Category values" indicates "Mesh" and "Multihop" category
action frames should be encrypted (Group addressed privacy == yes). And the
encyption key should be MGTK ([1] 10.13 Group addressed robust management frame
procedures). So this patch modifies the code to make it suitable for spec.

[1] IEEE Std 802.11-2012

Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:20 +02:00
Dan Carpenter
49708e3772 mac80211: silence an uninitialized variable warning
We normally return an uninitialized value, but no one checks it so it
doesn't matter.  Anyway, let's silence the static checker warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:19 +02:00
Arnd Bergmann
f151d9db4c nl80211: improve nl80211_parse_mesh_config type checking
When building a kernel with W=1, the nl80211.c file causes a number of
warnings, all about the same problem:

net/wireless/nl80211.c: In function 'nl80211_parse_mesh_config':
net/wireless/nl80211.c:5287:103: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5290:96: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5293:124: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5295:148: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5298:106: error: comparison is always false due to limited range of data type [-Werror=type-limits]
net/wireless/nl80211.c:5305:116: error: comparison is always false due to limited range of data type [-Werror=type-limits]

The problem is that gcc does not notice that the check is generate
by a macro, so it complains about comparing an unsigned type against 0.

I've tried to come up with a way to rephrase that code in a way that
avoids the warnings and otherwise improves the code as well.

This uses a set of new helper functions that perform the range checking,
and should provide slightly better type safety than the older patch,
at the expense of adding 44 lines to the code. Binary code size is
basically unchanged though (20 bytes added to 126561 bytes .text).

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-30 12:06:18 +02:00
Daniel Borkmann
d2485c4242 bpf: add bpf_skb_change_type helper
This work adds a helper for changing skb->pkt_type in a controlled way.
We only allow a subset of possible values and can extend that in future
should other use cases come up. Doing this as a helper has the advantage
that errors can be handeled gracefully and thus helper kept extensible.

It's a write counterpart to pkt_type member we can already read from
struct __sk_buff context. Major use case is to change incoming skbs to
PACKET_HOST in a programmatic way instead of having to recirculate via
redirect(..., BPF_F_INGRESS), for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
Daniel Borkmann
6578171a7f bpf: add bpf_skb_change_proto helper
This patch adds a minimal helper for doing the groundwork of changing
the skb->protocol in a controlled way. Currently supported is v4 to
v6 and vice versa transitions, which allows f.e. for a minimal, static
nat64 implementation where applications in containers that still
require IPv4 can be transparently operated in an IPv6-only environment.
For example, host facing veth of the container can transparently do
the transitions in a programmatic way with the help of clsact qdisc
and cls_bpf.

Idea is to separate concerns for keeping complexity of the helper
lower, which means that the programs utilize bpf_skb_change_proto(),
bpf_skb_store_bytes() and bpf_lX_csum_replace() to get the job done,
instead of doing everything in a single helper (and thus partially
duplicating helper functionality). Also, bpf_skb_change_proto()
shouldn't need to deal with raw packet data as this is done by other
helpers.

bpf_skb_proto_6_to_4() and bpf_skb_proto_4_to_6() unclone the skb to
operate on a private one, push or pop additionally required header
space and migrate the gso/gro meta data from the shared info. We do
mark the gso type as dodgy so that headers are checked and segs
recalculated by the gso/gro engine. The gso_size target is adapted
as well. The flags argument added is currently reserved and can be
used for future extensions.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
Daniel Borkmann
80b48c4457 bpf: don't use raw processor id in generic helper
Use smp_processor_id() for the generic helper bpf_get_smp_processor_id()
instead of the raw variant. This allows for preemption checks when we
have DEBUG_PREEMPT, and otherwise uses the raw variant anyway. We only
need to keep the raw variant for socket filters, but we can reuse the
helper that is already there from cBPF side.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:54:40 -04:00
David S. Miller
ee58b57100 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, except the packet scheduler
conflicts which deal with the addition of the free list parameter
to qdisc_enqueue().

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:03:36 -04:00
Sven Eckelmann
a2d0816608 batman-adv: Fix bat_(iv|v) function declaration header
The bat_algo.h had some functions declared which were not part of the
bat_algo.c file. These are instead stored in bat_v.c and bat_iv_ogm.c. The
declaration should therefore be also in bat_v.h and bat_iv_ogm,h to make
them easier to find.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Linus Lüssing
4e3e823b5a batman-adv: Add debugfs table for mcast flags
This patch adds a debugfs table with originators and their according
multicast flags to help users figure out why multicast optimizations
might be enabled or disabled for them.

Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Sven Eckelmann
ba412080fb batman-adv: Consolidate logging related functions
There are several places in batman-adv which provide logging related
functions. These should be grouped together in the log.* files to make them
easier to find.

Reported-by: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Linus Lüssing
72f7b2deaf batman-adv: Adding logging of mcast flag changes
With this patch changes relevant to a node's own multicast flags are
printed to the 'mcast' log level.

Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Sven Eckelmann
01d350d147 batman-adv: move bat_algo functions into a separate file
The bat_algo functionality in main.c is mostly unrelated to the rest of the
content. It still takes up a large portion of this source file (~15%, 103
lines). Moving it to a separate file makes it better visible as a main
component of the batman-adv implementation and hides it less in the other
helper functions in main.c.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Linus Lüssing
687937ab34 batman-adv: Add multicast optimization support for bridged setups
With this patch we are finally able to support multicast optimizations
in bridged setups, too. So far, if a bridge was added on top of a
soft-interface (e.g. bat0) the batman-adv multicast optimizations
needed to be disabled to avoid packetloss.

Current Linux bridge implementations and API can now provide us
with the so far missing information about interested but "remote"
multicast receivers behind bridge ports.

The Linux bridge performs the detection of remote participants
interested in multicast packets with its own and mature so
called IGMP and MLD snooping code and stores that in its
database. With the new API provided by the bridge batman-adv can
now simply hook into this database.

We then reliably announce the gathered multicast listeners to
other nodes through the batman-adv translation table.

Additionally, the Linux bridge provides us with the information about
whether an IGMP/MLD querier exists. If there is none then we need to
disable multicast optimizations as we cannot learn about multicast
listeners on external, bridged-in host then.

Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Markus Pargmann
1f8dce4992 batman-adv: split tvlv into a separate file
The tvlv functionality in main.c is mostly unrelated to the rest of the
content. It still takes up a large portion of this source file (~45%, 588
lines). Moving it to a separate file makes it better visible as a main
component of the batman-adv implementation and hides it less in the other
helper functions in main.c

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
[sven@narfation.org: fix conflicts with current version, fix includes,
rewrote commit message]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Linus Lüssing
bd2a979e53 batman-adv: Always flood IGMP/MLD reports
With this patch IGMP or MLD reports are always flooded. This is
necessary for the upcoming bridge integration to function without
multicast packet loss.

With the report handling so far bridges might miss interested multicast
listeners, leading to wrongly excluding ports from multicast packet
forwarding.

Currently we are treating IGMP/MLD reports, the messages bridges use to
learn about interested multicast listeners, just as any other multicast
packet: We try to send them to nodes matching its multicast destination.

Unfortunately, the destination address of reports of the older
IGMPv2/MLDv1 protocol families do not strictly adhere to their own
protocol: More precisely, the interested receiver, an IGMPv2 or MLDv1
querier, itself usually does not listen to the multicast destination
address of any reports.

Therefore with this patch we are simply excluding IGMP/MLD reports from
the multicast forwarding code path and keep flooding them. By that
any bridge receives them and can properly learn about listeners.

To avoid compatibility issues with older nodes not yet implementing this
report handling, we need to force them to flood reports: We do this by
bumping the multicast TVLV version to 2, effectively disabling their
multicast optimization.

Tested-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Sven Eckelmann
fcafa5e74b batman-adv: Keep includes ordered by filename
It is easier to detect if a include is already there for a used
functionality when the includes are ordered. Using an alphabetic order
together with the grouping in commit 1e2c2a4fe4 ("batman-adv: Add
required includes to all files") makes includes better manageable.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Andrew Lunn
c0f25c802b batman-adv: Include frame priority in fragment header
Unfragmented frames which traverse a node have their skb->priority set
by looking at the IP ToS byte, or the 802.1p header. However for
fragments this is not possible, only one of the fragments will contain
the headers. Instead, place the priority into the fragment header and
on receiving a fragment, use this information to set the skb->priority
for when the fragment is forwarded.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00
Sven Eckelmann
67a5613ed0 batman-adv: Include main.h in bat_v_ogm.h
main.h includes statements which (re)define preprocessor variables which
influence the compiled code. This makes it necessary to include it in all
files. For example, it redefines pr_fmt used to the module as prefix for
each pr_* message.

Reported-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2016-06-30 10:29:43 +02:00