44719 Commits

Author SHA1 Message Date
Pablo Neira Ayuso
633c9a840d netfilter: nfnetlink: avoid recurrent netns lookups in call_batch
Pass the net pointer to the call_batch callback functions so we can skip
recurrent lookups.
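
As a rough user-space illustration of the idea (not the nfnetlink code itself), the sketch below resolves a context once per batch and hands it to every callback. All `demo_*` names are hypothetical stand-ins.

```c
#include <assert.h>

/* Hypothetical stand-in for a network namespace. */
struct demo_net { int id; };

static struct demo_net demo_init_net = { 1 };
static int demo_lookups;   /* counts how often the lookup runs */

/* Models the comparatively expensive namespace lookup. */
static struct demo_net *demo_lookup_net(void)
{
    demo_lookups++;
    return &demo_init_net;
}

/* Batch callbacks receive the already-resolved pointer. */
typedef int (*demo_batch_cb)(struct demo_net *net, int msg);

static int demo_handle(struct demo_net *net, int msg)
{
    return net->id + msg;
}

/* Resolve the namespace once, then pass it to each callback. */
static int demo_run_batch(const int *msgs, int n, demo_batch_cb cb)
{
    struct demo_net *net = demo_lookup_net();
    int sum = 0;

    for (int i = 0; i < n; i++)
        sum += cb(net, msgs[i]);
    return sum;
}
```

With this shape the lookup count stays at one per batch, regardless of how many messages the batch carries.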

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
2015-12-10 13:49:24 +01:00
Johannes Berg
b7bb110008 rfkill: copy the name into the rfkill struct
Some users of rfkill, like NFC and cfg80211, use a dynamic name when
allocating rfkill, in those cases dev_name(). Therefore, the pointer
passed to rfkill_alloc() might not be valid forever. I specifically
found the case that the rfkill name was quite obviously an invalid
pointer (or at least garbage) when the wiphy had been renamed.

Fix this by making a copy of the rfkill name in rfkill_alloc().
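
A minimal user-space sketch of the copy-on-allocate idea, assuming a hypothetical `demo_rfkill` struct and `demo_rfkill_alloc()` helper (neither is the kernel API): the name is copied into storage owned by the struct, so later changes to the caller's buffer cannot invalidate it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the rfkill struct; the name is stored inline. */
struct demo_rfkill {
    int  blocked;
    char name[];   /* flexible array member holding a private copy */
};

/* Allocate the struct and copy the name into it. */
static struct demo_rfkill *demo_rfkill_alloc(const char *name)
{
    size_t len = strlen(name) + 1;   /* include the terminating NUL */
    struct demo_rfkill *rk = malloc(sizeof(*rk) + len);

    if (!rk)
        return NULL;
    rk->blocked = 0;
    memcpy(rk->name, name, len);
    return rk;
}
```

Storing only the pointer instead would leave the struct referring to memory the caller may rename or free.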

Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-10 10:37:51 +01:00
Alexander Aring
b1815fd949 6lowpan: add debugfs support
This patch introduces a 6lowpan entry into debugfs, if enabled.
Inside this 6lowpan directory we create a subdirectory for each 6lowpan
interface to offer per-interface debugfs support.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Alexander Aring
00f5931411 6lowpan: add lowpan dev register helpers
This patch introduces register and unregister functionality for lowpan
interfaces. While registering a lowpan interface there are several things
which need to be initialized by the 6lowpan subsystem. Upcoming
functionality needs to register/unregister per-interface components, e.g.
a debugfs entry.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Stefan Schmidt
43f26e17d0 6lowpan: add nhc module for GHC routing extension header detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Stefan Schmidt
2f4799478c 6lowpan: add nhc module for GHC fragmentation extension header detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Stefan Schmidt
20616a5a1e 6lowpan: add nhc module for GHC destination extension header detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Stefan Schmidt
c39da3bb5b 6lowpan: add nhc module for GHC ICMPv6 detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:24 +01:00
Stefan Schmidt
70cc86752e 6lowpan: add nhc module for GHC UDP detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:24 +01:00
Stefan Schmidt
7e568f50c1 6lowpan: add nhc module for GHC hop-by-hop extension header detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:24 +01:00
Stefan Schmidt
5e5c08cbee 6lowpan: clarify Kconfig entries for upcoming GHC support
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:24 +01:00
Yichen Zhao
1a11ec89db Bluetooth: Fix locking in bt_accept_dequeue after disconnection
Fix a crash that may happen when bt_accept_dequeue is run after a
Bluetooth connection has been disconnected. bt_accept_unlink was called
after release_sock, permitting bt_accept_unlink to run twice on the same
socket and cause a NULL pointer dereference.

[50510.241632] BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
[50510.241694] IP: [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.241759] PGD 0
[50510.241776] Oops: 0002 [#1] SMP
[50510.241802] Modules linked in: rtl8192cu rtl_usb rtlwifi rtl8192c_common 8021q garp stp mrp llc rfcomm bnep nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp arc4 ath9k ath9k_common ath9k_hw ath kvm eeepc_wmi asus_wmi mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek sparse_keymap crct10dif_pclmul snd_hda_codec_generic crc32_pclmul snd_hda_intel snd_hda_controller cfg80211 snd_hda_codec i915 snd_hwdep snd_pcm ghash_clmulni_intel snd_timer snd soundcore serio_raw cryptd drm_kms_helper drm i2c_algo_bit shpchp ath3k mei_me lpc_ich btusb bluetooth 6lowpan_iphc mei lp parport wmi video mac_hid psmouse ahci libahci r8169 mii
[50510.242279] CPU: 0 PID: 934 Comm: krfcommd Not tainted 3.16.0-49-generic #65~14.04.1-Ubuntu
[50510.242327] Hardware name: ASUSTeK Computer INC. VM40B/VM40B, BIOS 1501 12/09/2014
[50510.242370] task: ffff8800d9068a30 ti: ffff8800d7a54000 task.ti: ffff8800d7a54000
[50510.242413] RIP: 0010:[<ffffffffc01243f7>]  [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.242480] RSP: 0018:ffff8800d7a57d58  EFLAGS: 00010246
[50510.242511] RAX: 0000000000000000 RBX: ffff880119bb8c00 RCX: ffff880119bb8eb0
[50510.242552] RDX: ffff880119bb8eb0 RSI: 00000000fffffe01 RDI: ffff880119bb8c00
[50510.242592] RBP: ffff8800d7a57d60 R08: 0000000000000283 R09: 0000000000000001
[50510.242633] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800d8da9eb0
[50510.242673] R13: ffff8800d74fdb80 R14: ffff880119bb8c00 R15: ffff8800d8da9c00
[50510.242715] FS:  0000000000000000(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000
[50510.242761] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[50510.242794] CR2: 00000000000001a8 CR3: 0000000001c13000 CR4: 00000000001407f0
[50510.242835] Stack:
[50510.242849]  ffff880119bb8eb0 ffff8800d7a57da0 ffffffffc0124506 ffff8800d8da9eb0
[50510.242899]  ffff8800d8da9c00 ffff8800d9068a30 0000000000000000 ffff8800d74fdb80
[50510.242949]  ffff8800d6f85208 ffff8800d7a57e08 ffffffffc0159985 000000000000001f
[50510.242999] Call Trace:
[50510.243027]  [<ffffffffc0124506>] bt_accept_dequeue+0xb6/0x180 [bluetooth]
[50510.243085]  [<ffffffffc0159985>] l2cap_sock_accept+0x125/0x220 [bluetooth]
[50510.243128]  [<ffffffff810a1b30>] ? wake_up_state+0x20/0x20
[50510.243163]  [<ffffffff8164946e>] kernel_accept+0x4e/0xa0
[50510.243200]  [<ffffffffc05b97cd>] rfcomm_run+0x1ad/0x890 [rfcomm]
[50510.243238]  [<ffffffffc05b9620>] ? rfcomm_process_rx+0x8a0/0x8a0 [rfcomm]
[50510.243281]  [<ffffffff81091572>] kthread+0xd2/0xf0
[50510.243312]  [<ffffffff810914a0>] ? kthread_create_on_node+0x1c0/0x1c0
[50510.243353]  [<ffffffff8176e9d8>] ret_from_fork+0x58/0x90
[50510.243387]  [<ffffffff810914a0>] ? kthread_create_on_node+0x1c0/0x1c0
[50510.243424] Code: 00 48 8b 93 b8 02 00 00 48 8d 83 b0 02 00 00 48 89 51 08 48 89 0a 48 89 83 b0 02 00 00 48 89 83 b8 02 00 00 48 8b 83 c0 02 00 00 <66> 83 a8 a8 01 00 00 01 48 c7 83 c0 02 00 00 00 00 00 00 f0 ff
[50510.243685] RIP  [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.243737]  RSP <ffff8800d7a57d58>
[50510.243758] CR2: 00000000000001a8
[50510.249457] ---[ end trace bb984f932c4e3ab3 ]---

Signed-off-by: Yichen Zhao <zhaoyichen@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:51 +01:00
Johan Hedberg
acb9f911ea Bluetooth: Don't treat connection timeout as a failure
When we're doing background scanning and connection attempts it's
possible we time out trying to connect and go back to scanning again.
The timeout triggers a HCI_LE_Create_Connection_Cancel which will
trigger a Connection Complete with "Unknown Connection Identifier"
error status. Since we go back to scanning this isn't really a failure
and shouldn't be presented as such to user space through mgmt.

The exception to this is if the connection attempt was due to an
explicit request on an L2CAP socket (indicated by
params->explicit_connect being true). Since the socket will get an
error it's consistent to also notify the failure on mgmt in this case.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:51 +01:00
Johan Hedberg
2f99536a5b Bluetooth: Use continuous scanning when creating LE connections
All LE connections are now triggered through a preceding passive scan
and waiting for a connectable advertising report. This means we've got
the best possible guarantee that the device is within range and should
be able to request the controller to perform continuous scanning. This
way we minimize the risk that we miss out on any advertising packets.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org # 4.3+
2015-12-10 00:51:51 +01:00
Johan Hedberg
cab054ab47 Bluetooth: Clean up current advertising instance tracking
We can simplify a lot of code by making sure hdev->cur_adv_instance is
always up-to-date. This allows e.g. the removal of the
get_current_adv_instance() helper function and the special
HCI_ADV_CURRENT value. This patch also makes selecting instance 0x00
explicit in the various calls where advertising instances aren't
enabled, e.g. when HCI_ADVERTISING is set or we've just finished
enabling LE.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:50 +01:00
Johan Hedberg
d6b7e2cddb Bluetooth: Clean up advertising initialization in powered_update_hci()
The logic in powered_update_hci() to initialize the advertising data &
state is a bit more complicated than it needs to be. It was previously
not doing anything if HCI_LE_ENABLED wasn't set, but this was not
obvious by quickly looking at the code. Now the conditions for the
various actions are more explicit. Another simplification is due to
the fact that __hci_req_schedule_adv_instance() takes care of setting
hdev->cur_adv_instance so there's no need to set it before calling the
function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:50 +01:00
Johan Hedberg
550a8ca765 Bluetooth: Remove redundant check for req.cmd_q
The hci_req_run() function already checks for empty cmd_q and bails
out if necessary. Also, req.cmd_q should really be treated as private
data of the request and not accessed directly.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
d6dac32e84 Bluetooth: Fix updating wrong instance's scan_rsp data
The __hci_req_update_scan_rsp_data gets the instance to be updated
which should get passed to update_inst_scan_rsp_data() instead of
always enabling the current instance.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
17fd08ffb5 Bluetooth: Remove unnecessary HCI_ADVERTISING_INSTANCE flag
This flag just tells us whether hdev->adv_instances is empty or not.
We can equally well use the list_empty() function to get this
information.
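
To illustrate why a separate "list is non-empty" flag is redundant, here is a small user-space sketch of a circular doubly linked list in the style of the kernel's list helpers. The `demo_*` names are hypothetical re-implementations, not the kernel's `list.h`.

```c
#include <assert.h>
#include <stdbool.h>

/* Circular doubly linked list node; an empty list points at itself. */
struct demo_list_head {
    struct demo_list_head *next, *prev;
};

static void demo_list_init(struct demo_list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* The list itself already encodes emptiness, so no extra flag is needed. */
static bool demo_list_empty(const struct demo_list_head *h)
{
    return h->next == h;
}

/* Insert entry n right after head h. */
static void demo_list_add(struct demo_list_head *n, struct demo_list_head *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

/* Unlink entry e and leave it self-linked. */
static void demo_list_del(struct demo_list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e;
    e->prev = e;
}
```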

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
02c04afea9 Bluetooth: Simplify read_adv_features code
The code in the Read Advertising Features mgmt command handler is
unnecessarily complicated. Clean it up and remove unnecessary
variables & branches.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
2ff13894cf Bluetooth: Perform HCI update for power on synchronously
The request to update HCI during power on is always coming either from
hdev->req_workqueue or through an ioctl, so it's safe to use
hci_req_sync for it. This way we also eliminate potential races with
incoming mgmt commands or other actions while powering on.

Part of this refactoring is the splitting of mgmt_powered() into
mgmt_power_on() and __mgmt_power_off() functions. The main reason is
the different requirements as far as hdev locking is concerned, as
highlighted with the __ prefix of the power off API.

Since the power on in the case of clearing the AUTO_OFF flag cannot be
done synchronously in the set_powered mgmt handler, the hci_power_on
work callback is extended to cover this (which also simplifies the
set_powered helper a lot).

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
bf943cbf76 Bluetooth: Move fast connectable code to hci_request.c
We'll soon need this both in hci_request.c and mgmt.c so move it to
hci_request.c as a generic helper.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
b1a8917c9b Bluetooth: Move EIR update to hci_request.c
We'll soon need to update the EIR both from hci_request.c and mgmt.c
so move update_eir() as a more generic request helper to
hci_request.c.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
00cf5040b3 Bluetooth: HCI name update to hci_request.c
We'll soon need this both from hci_request.c and mgmt.c so move it as
a request helper function to hci_request.c.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
c366f555b8 Bluetooth: Move discoverable timeout behind hdev->req_workqueue
Since the other discoverable changes are behind req_workqueue now it
only makes sense to move the discoverable timeout there as well.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
aed1a8851d Bluetooth: Move discoverable changes to hdev->req_workqueue
The discoverable mode is intrinsically linked with the connectable
mode e.g. through sharing the same HCI command (Write Scan Enable) for
BR/EDR. It therefore makes sense to move it to hci_request.c and run
the changes through the same hdev->req_workqueue.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
14bf5eac7a Bluetooth: Perform Class of Device changes through hdev->req_workqueue
The Class of Device needs to be changed e.g. for limited discoverable
mode. In preparation of moving the discoverable mode to hci_request.c
and hdev->req_workqueue, move the Class of Device helpers there first.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
53c0ba7451 Bluetooth: Move connectable changes to hdev->req_workqueue
This way the connectable changes are synchronized against each other,
which helps avoid potential races. The connectable mode is also linked
together with LE advertising which makes it more convenient to have it
behind the same workqueue.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
f22525700b Bluetooth: Move advertising instance management to hci_request.c
This paves the way for eventually performing advertising changes
through the hdev->req_workqueue. Some new APIs need to be exposed from
mgmt.c to hci_request.c and vice-versa, but many of them will go away
once hdev->req_workqueue gets used.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:47 +01:00
Johan Hedberg
196a5e97d1 Bluetooth: Move __hci_update_background_scan up in hci_request.c
This way we avoid the need to do a forward declaration in later
patches.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:47 +01:00
Johan Hedberg
01b1cb87d3 Bluetooth: Run page scan updates through hdev->req_workqueue
Since Add/Remove Device perform the page scan updates independently
from the HCI command completion we've introduced a potential race when
multiple mgmt commands are queued. Doing the page scan updates through
the req_workqueue ensures that the state changes are performed in a
race-free manner.

At the same time, to make the request helper more widely usable,
extend it to also cover Inquiry Scan changes since those are behind
the same HCI command. This is also reflected in the new name of the
API as well as the work struct name.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:47 +01:00
Florian Westphal
9fb0b519c7 netfilter: nf_tables: fix nf_log_trace based tracing
nf_log_trace() outputs bogus 'TRACE:' strings because I forgot to update
the comments array.

Fixes: 33d5a7b14b ("netfilter: nf_tables: extend tracing infrastructure")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 16:53:46 +01:00
Rosen, Rami
23509fcd4e netfilter: nfnetlink_log: Change setter functions to be void
Change return type of nfulnl_set_timeout() and nfulnl_set_qthresh() to
be void.

This patch changes the return type of the static methods
nfulnl_set_timeout() and nfulnl_set_qthresh() to be void, as there is no
justification and no need for these methods to return int.

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 14:52:56 +01:00
Nikolay Borisov
639e077b43 netfilter: nfnetlink_queue: Unregister pernet subsys in case of init failure
Commit 3bfe049807 ("netfilter: nfnetlink_{log,queue}:
Register pernet in first place") reorganised the initialisation
order of the pernet_subsys to avoid "use-before-initialised"
condition. However, in doing so the cleanup logic in nfnetlink_queue
got botched in that the pernet_subsys wasn't cleaned in case
nfnetlink_subsys_register failed. This patch adds the necessary
cleanup routine call.
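
The unwind-on-failure pattern described above can be sketched in user space as follows. The `demo_*` functions and the `demo_fail_subsys` switch are hypothetical; they only model the ordering of registration and cleanup, not the nfnetlink_queue code.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_pernet_registered;
static bool demo_fail_subsys;   /* test switch: force the second step to fail */

static int demo_register_pernet(void)
{
    demo_pernet_registered = true;
    return 0;
}

static void demo_unregister_pernet(void)
{
    demo_pernet_registered = false;
}

static int demo_register_subsys(void)
{
    return demo_fail_subsys ? -1 : 0;
}

/* Register pernet first; undo it if the later step fails. */
static int demo_init(void)
{
    int status = demo_register_pernet();

    if (status < 0)
        return status;

    status = demo_register_subsys();
    if (status < 0)
        goto cleanup_pernet;

    return 0;

cleanup_pernet:
    demo_unregister_pernet();
    return status;
}
```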

Fixes: 3bfe049807 ("netfilter: nfnetlink_{log,queue}: Register pernet in first place")
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 14:46:47 +01:00
Florian Westphal
e97ac12859 netfilter: ipv6: nf_defrag: fix NULL deref panic
Valdis reports NULL deref in nf_ct_frag6_gather.
Problem is bogus use of skb_queue_walk() -- we miss first skb in the list
since we start with head->next instead of head.

In case the element we're looking for was head->next we won't find
a result and then trip over NULL iter.

(defrag uses plain NULL-terminated list rather than one terminated by
 head-of-list-pointer, which is what skb_queue_walk expects).
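
A simplified user-space sketch of the pitfall, assuming a hypothetical `demo_frag` node type: on a plain NULL-terminated list, a walk that starts at `head->next` never examines the first element, whereas a walk starting at `head` does. This models only the iteration mistake, not the defrag code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment node on a plain NULL-terminated list. */
struct demo_frag {
    int offset;
    struct demo_frag *next;
};

/* Buggy walk: starts at head->next, so the first node is skipped. */
static struct demo_frag *demo_find_skipping_head(struct demo_frag *head,
                                                 int offset)
{
    struct demo_frag *it;

    for (it = head ? head->next : NULL; it; it = it->next)
        if (it->offset == offset)
            return it;
    return NULL;
}

/* Correct walk for a NULL-terminated list: starts at head itself. */
static struct demo_frag *demo_find(struct demo_frag *head, int offset)
{
    struct demo_frag *it;

    for (it = head; it; it = it->next)
        if (it->offset == offset)
            return it;
    return NULL;
}
```

A caller that dereferences the result of the first variant without checking it would hit a NULL pointer whenever the wanted node is the first one.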

Fixes: 029f7f3b87 ("netfilter: ipv6: nf_defrag: avoid/free clone operations")
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 14:26:31 +01:00
Florian Westphal
e639f7ab07 netfilter: nf_tables: wrap tracing with a static key
Only needed when meta nftrace rule(s) were added.
The assumption is that no such rules are active, so the call to
nft_trace_init is "never" needed.

When nftrace rules are active, we always call the nft_trace_* functions,
but will only send netlink messages when all of the following are true:

 - traceinfo structure was initialised
 - skb->nf_trace == 1
 - at least one subscriber to trace group.

Adding an extra conditional
(static_branch ... && skb->nf_trace)
	nft_trace_init( ..)

Is possible but results in a larger nft_do_chain footprint.
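
The control flow can be modelled in user space as below. This is only a sketch: a plain integer stands in for the static key (a real static key patches the branch at runtime, which this does not reproduce), and all `demo_*` names are hypothetical.

```c
#include <assert.h>

/* Stand-in for the static key: non-zero once any nftrace rule exists. */
static int demo_trace_enabled;
static int demo_trace_init_calls;

struct demo_traceinfo { int initialised; };

static void demo_trace_init(struct demo_traceinfo *info)
{
    info->initialised = 1;
    demo_trace_init_calls++;
}

/* The hot path only pays for trace setup when tracing is enabled. */
static void demo_do_chain(struct demo_traceinfo *info)
{
    info->initialised = 0;
    if (demo_trace_enabled)
        demo_trace_init(info);
}
```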

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 13:23:13 +01:00
Florian Westphal
33d5a7b14b netfilter: nf_tables: extend tracing infrastructure
nft monitor mode can then decode and display this trace data.

Parts of LL/Network/Transport headers are provided as separate
attributes.

Otherwise, printing IP address data becomes virtually impossible
for userspace since in the case of the netdev family we really don't
want userspace to have to know all the possible link layer types
and/or sizes just to display/print an ip address.

We also don't want userspace to have to follow ipv6 header chains
to get the s/dport info, the kernel already did this work for us.

To avoid bloating nft_do_chain all data required for tracing is
encapsulated in nft_traceinfo.

The structure is initialized unconditionally(!) for each nft_do_chain
invocation.

This unconditional call will be moved under a static key in a
followup patch.

With lots of help from Patrick McHardy and Pablo Neira.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 13:18:37 +01:00
Tejun Heo
bd1060a1d6 sock, cgroup: add sock->sk_cgroup
In cgroup v1, dealing with cgroup membership was difficult because the
number of membership associations was unbounded.  As a result, cgroup v1
grew several controllers whose primary purpose is either tagging
membership or pulling in configuration knobs from other subsystems so
that cgroup membership tests can be avoided.

net_cls and net_prio controllers are examples of the latter.  They
allow configuring network-specific attributes from cgroup side so that
network subsystem can avoid testing cgroup membership; unfortunately,
these are not only cumbersome but also problematic.

Both net_cls and net_prio aren't properly hierarchical.  Both inherit
configuration from the parent on creation but there's no interaction
afterwards.  An ancestor doesn't restrict the behavior in its subtree
in any way and configuration changes aren't propagated downwards.
Especially when combined with cgroup delegation, this is problematic
because delegatees can mess up whatever network configuration is
implemented at the system level.  net_prio would allow the delegatees
to set whatever priority value regardless of CAP_NET_ADMIN and net_cls
the same for classid.

While it is possible to solve these issues from controller side by
implementing hierarchical allowable ranges in both controllers, it
would involve quite a bit of complexity in the controllers and further
obfuscate network configuration as it becomes even more difficult to
tell what's actually being configured looking from the network side.
While not much can be done for v1 at this point, as membership
handling is sane on cgroup v2, it'd be better to make cgroup matching
behave like other network matches and classifiers than introducing
further complications.

In preparation, this patch updates sock->sk_cgrp_data handling so that
it points to the v2 cgroup that sock was created in until either
net_prio or net_cls is used.  Once either of the two is used,
sock->sk_cgrp_data reverts to its previous role of carrying prioidx
and classid.  This is to avoid adding yet another cgroup related field
to struct sock.
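
One way to picture the mode switch is the user-space sketch below. It is an assumption-laden model, not the kernel's sock_cgroup_data layout: a hypothetical `is_data` flag selects whether the shared storage holds a cgroup pointer or the prioidx/classid pair, and the fallback value 0 for classid is chosen here purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_cgroup { int id; };

/* Shared storage: either a cgroup pointer or prioidx/classid data. */
struct demo_sock_cgroup_data {
    int is_data;   /* 0: pointer mode, 1: prioidx/classid mode */
    union {
        struct demo_cgroup *cgroup;
        struct {
            uint16_t prioidx;
            uint32_t classid;
        } data;
    } u;
};

/* New sockets start in pointer mode. */
static void demo_skcd_init(struct demo_sock_cgroup_data *d,
                           struct demo_cgroup *cg)
{
    d->is_data = 0;
    d->u.cgroup = cg;
}

/* First use of classid switches the storage to data mode. */
static void demo_set_classid(struct demo_sock_cgroup_data *d, uint32_t classid)
{
    if (!d->is_data) {
        d->is_data = 1;
        d->u.data.prioidx = 0;
    }
    d->u.data.classid = classid;
}

static struct demo_cgroup *demo_cgroup_of(const struct demo_sock_cgroup_data *d)
{
    return d->is_data ? NULL : d->u.cgroup;
}

static uint32_t demo_classid_of(const struct demo_sock_cgroup_data *d)
{
    return d->is_data ? d->u.data.classid : 0;
}
```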

As the mode switching can happen at most once per boot, the switching
mechanism is aimed at lowering hot path overhead.  It may leak a
finite, likely small, number of cgroup refs and report spurious
prioidx or classid on switching; however, dynamic updates of prioidx
and classid have always been racy and lossy - socks between creation
and fd installation are never updated, config changes don't update
existing sockets at all, and prioidx may index with dead and recycled
cgroup IDs.  Non-critical inaccuracies from small race windows won't
make any noticeable difference.

This patch doesn't make use of the pointer yet.  The following patch
will implement netfilter match for cgroup2 membership.

v2: Use sock_cgroup_data to avoid inflating struct sock w/ another
    cgroup specific field.

v3: Add comments explaining why sock_data_prioidx() and
    sock_data_classid() use different fallback values.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 22:02:33 -05:00
Tejun Heo
2a56a1fec2 net: wrap sock->sk_cgrp_prioidx and ->sk_classid inside a struct
Introduce sock->sk_cgrp_data which is a struct sock_cgroup_data.
->sk_cgrp_prioidx and ->sk_classid are moved into it.  The struct
and its accessors are defined in cgroup-defs.h.  This is to prepare
for overloading the fields with a cgroup pointer.

This patch mostly performs equivalent conversions but the following
are noteworthy.

* Equality test before updating classid is removed from
  sock_update_classid().  This shouldn't make any noticeable
  difference and a similar test will be implemented on the helper side
  later.

* sock_update_netprioidx() now takes struct sock_cgroup_data and can
  be moved to netprio_cgroup.h without causing include dependency
  loop.  Moved.

* The dummy version of sock_update_netprioidx() is converted to a static
  inline function while at it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 22:02:33 -05:00
Tejun Heo
297dbde19c netprio_cgroup: limit the maximum css->id to USHRT_MAX
netprio builds per-netdev contiguous priomap array which is indexed by
css->id.  The array is allocated using kzalloc() effectively limiting
the maximum ID supported to some thousand range.  This patch caps the
maximum supported css->id to USHRT_MAX which should be way above what
is actually useable.
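
A small sketch of the capping idea, assuming a hypothetical `demo_store_prioidx()` helper: IDs above USHRT_MAX are rejected so that the accepted ones fit into a 16-bit field. The choice of -ENOSPC as the error value is an assumption made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Reject IDs that would not fit into a 16-bit prioidx field. */
static int demo_store_prioidx(unsigned int css_id, uint16_t *prioidx)
{
    if (css_id > USHRT_MAX)
        return -ENOSPC;   /* error code chosen for illustration */

    *prioidx = (uint16_t)css_id;
    return 0;
}
```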

This allows reducing sock->sk_cgrp_prioidx to u16 from u32.  The freed
up part will be used to overload the cgroup related fields.
sock->sk_cgrp_prioidx's position is swapped with sk_mark so that the
two cgroup related fields are adjacent.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 22:02:33 -05:00
Stefan Hajnoczi
8ac2837c89 Revert "Merge branch 'vsock-virtio'"
This reverts commit 0d76d6e8b2 and merge
commit c402293bd7, reversing changes made
to c89359a42e.

The virtio-vsock device specification is not finalized yet.  Michael
Tsirkin voiced concern about merging this code when the hardware
interface (and possibly the userspace interface) could still change.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 21:55:49 -05:00
Rainer Weikusat
760a432247 net: Fix inverted test in __skb_recv_datagram
As the kernel generally uses negated error numbers, *err needs to be
compared with -EAGAIN (d'oh).
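
The sign convention can be shown with a tiny user-space sketch. The `demo_*` helpers are hypothetical; only `EAGAIN` from `<errno.h>` is real.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-style return value: 0 on success, negated errno on failure. */
static int demo_try_recv(bool data_ready)
{
    return data_ready ? 0 : -EAGAIN;
}

/* The check must compare against the negated constant. */
static bool demo_should_wait(int err)
{
    return err == -EAGAIN;
}
```

Comparing against the positive `EAGAIN` would never match a value produced this way, which is the kind of inverted test the commit fixes.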

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Fixes: ea3793ee29 ("core: enable more fine-grained datagram reception control")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 11:30:17 -05:00
Eric Dumazet
bd5eb35f16 xfrm: take care of request sockets
TCP SYNACK messages might now be attached to request sockets.

XFRM needs to get back to a listener socket.

Adds new helpers that might be used elsewhere:
sk_to_full_sk() and sk_const_to_full_sk()
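
A simplified model of what such a helper does, using hypothetical `demo_*` types rather than the kernel's socket structures: a request socket is mapped back to its listener, and any other socket is returned unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket states; only the request state matters here. */
enum demo_state { DEMO_ESTABLISHED, DEMO_LISTEN, DEMO_NEW_SYN_RECV };

struct demo_sock {
    enum demo_state state;
    struct demo_sock *listener;   /* set only for request sockets */
};

/* Map a request socket to its listener; leave full sockets alone. */
static struct demo_sock *demo_to_full_sk(struct demo_sock *sk)
{
    if (sk && sk->state == DEMO_NEW_SYN_RECV)
        return sk->listener;
    return sk;
}
```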

Note: We also need to add RCU protection for xfrm lookups,
now TCP/DCCP have lockless listener processing. This will
be addressed in separate patches.

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 17:07:33 -05:00
Eric Dumazet
69ce6487dc ipv6: sctp: fix lockdep splat in sctp_v6_get_dst()
While cooking the sctp np->opt rcu fixes, I forgot to move
one rcu_read_unlock() after the added rcu_dereference() in
sctp_v6_get_dst()

This gave lockdep warnings reported by Dave Jones.

Fixes: c836a8ba93 ("ipv6: sctp: add rcu protection around np->opt")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 17:07:33 -05:00
David S. Miller
0c9cd7c433 Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Antonio Quartulli says:

====================
Included changes:
- prevent compatibility issue between DAT and speedy join from creating
  inconsistencies in the global translation table
- make sure temporary TT entries are purged out if not claimed
- fix comparison function used for TT hash table
- fix invalid stack access in batadv_dat_select_candidates()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:59:19 -05:00
Neil Armstrong
cda5c15b23 net: dsa: move dsa slave destroy code to slave.c
Move dsa slave dedicated code from dsa_switch_destroy to a new
dsa_slave_destroy function in slave.c.
Add the netif_carrier_off and phy_disconnect calls in order to
correctly clean up the netdev state and PHY state machine.

Signed-off-by: Frode Isaksen <fisaksen@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:35:50 -05:00
Neil Armstrong
679fb46c57 net: dsa: Add missing master netdev dev_put() calls
Upon probe failure or unbinding, add missing dev_put() calls.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:35:50 -05:00
Neil Armstrong
b0dc635d92 net: dsa: cleanup resources upon module removal
Make sure that we unassign the master_netdev dsa_ptr to make the packet
processing go through the regular Ethernet receive path.

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:35:50 -05:00
Neil Armstrong
4baee937b8 net: dsa: remove DSA link polling
Since no DSA driver uses the polling callback any more, and since
phylib handles the link detection, remove the link polling
work and timer code.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:35:49 -05:00
Robert Shearman
fe82b3300e mpls: fix sending of local encapped packets
Locally generated IPv4 and (probably) IPv6 packets are dropped because
skb->protocol isn't set. We could write wrappers to lwtunnel_output
for IPv4 and IPv6 that set the protocol accordingly and then call
lwtunnel_output, but mpls_output relies on the AF-specific type of dst
anyway to get the via address.

Therefore, make use of dst->dst_ops->family in mpls_output to
determine the type of nexthop and thus protocol of the packet instead
of checking skb->protocol.
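
The family-to-protocol mapping can be sketched as below. `demo_proto_from_family()` and the `DEMO_ETH_P_*` constants are hypothetical names for this example; the numeric values are the standard EtherTypes for IPv4 and IPv6.

```c
#include <assert.h>
#include <sys/socket.h>

#define DEMO_ETH_P_IP   0x0800   /* EtherType for IPv4 */
#define DEMO_ETH_P_IPV6 0x86DD   /* EtherType for IPv6 */

/* Derive the payload protocol from the address family of the route. */
static int demo_proto_from_family(int family)
{
    switch (family) {
    case AF_INET:
        return DEMO_ETH_P_IP;
    case AF_INET6:
        return DEMO_ETH_P_IPV6;
    default:
        return -1;   /* unknown family: caller should drop the packet */
    }
}
```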

Fixes: 61adedf3e3 ("route: move lwtunnel state to dst_entry")
Reported-by: Sam Russell <sam.h.russell@gmail.com>
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:32:47 -05:00