The mask option allows you put all address belonging that mask into
the same recent slot. This can be useful in case that recent is used
to detect attacks from the same network segment.
Tested for backward compatibility.
Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
because reply packets need to go to the same nfqueue, src/dst ip
address were xor'd prior to jhash().
However, this causes bad distribution for some workloads, e.g.
flows a.b.1.{1,n} -> a.b.2.{1,n} all share the same hash value.
Avoid this by hashing both. To get same hash for replies,
first argument is the smaller address.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Since the sysctl data for l[3|4]proto now resides in pernet nf_proto_net.
We can now remove this unused fields from struct nf_contrack_l[3,4]proto.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch modifies the GRE protocol tracker, which partially
supported namespace before this patch, to use the new namespace
infrastructure for nf_conntrack.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch modifies the DCCP protocol tracker to use the new
namespace infrastructure for nf_conntrack.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch prepares the namespace support for layer 3 protocol trackers.
Basically, this modifies the following interfaces:
* nf_ct_l3proto_[un]register_sysctl.
* nf_conntrack_l3proto_[un]register.
We add a new nf_ct_l3proto_net is used to get the pernet data of l3proto.
This adds rhe new struct nf_ip_net that is used to store the sysctl header
and l3proto_ipv4,l4proto_tcp(6),l4proto_udp(6),l4proto_icmp(v6) because the
protos such tcp and tcp6 use the same data,so making nf_ip_net as a field
of netns_ct is the easiest way to manager it.
This patch also adds init_net to struct nf_conntrack_l3proto to initial
the layer 3 protocol pernet data.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch prepares the namespace support for layer 4 protocol trackers.
Basically, this modifies the following interfaces:
* nf_ct_[un]register_sysctl
* nf_conntrack_l4proto_[un]register
to include the namespace parameter. We still use init_net in this patch
to prepare the ground for follow-up patches for each layer 4 protocol
tracker.
We add a new net_id field to struct nf_conntrack_l4proto that is used
to store the pernet_operations id for each layer 4 protocol tracker.
Note that AF_INET6's protocols do not need to do sysctl compat. Thus,
we only register compat sysctl when l4proto.l3proto != AF_INET6.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Implement a new "fail-open" mode where packets are not dropped
upon queue-full condition. This mode can be enabled/disabled per
queue using netlink NFQA_CFG_FLAGS & NFQA_CFG_MASK attributes.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Vivek Kashyap <vivk@us.ibm.com>
Signed-off-by: Sridhar Samudrala <samudrala@us.ibm.com>
The nat_rtp_rtcp hook takes two separate parameters port and rtp_port.
port is expected to be the real h245 address (found inside the packet).
rtp_port is the even number closest to port (RTP ports are even and
RTCP ports are odd).
However currently, both port and rtp_port are having same value (both are
rounded to nearest even numbers).
This works well in case of openlogicalchannel with media (RTP/even) port.
But in case of openlogicalchannel for media control (RTCP/odd) port,
h245 address in the packet is wrongly modified to have an even port.
I am attaching a pcap demonstrating the problem, for any further analysis.
This behavior was introduced around v2.6.19 while rewriting the helper.
Signed-off-by: Jagdish Motwani <jagdish.motwani@elitecore.com>
Signed-off-by: Sanket Shah <sanket.shah@elitecore.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch addresses two issues:
a) Fix usage of u32 and __be32 that causes endianess warnings via sparse.
b) Ensure consistent hashing in a cluster that is composed of big and
little endian systems. Thus, we obtain the same hash mark in an
heterogeneous cluster.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
John Linville says:
====================
Amitkumar Karwar gives us a cfg80211 fix that changes some state
tracking in order to avoid a WARNING.
Arik Nemtsov provide a mac80211 fix for an RCU-related race.
Avinash Patil shares a pair of mwifiex fixes, one which invalidates
some stale configuration data before a channel change and another to
restrict hidden SSID support to zero-length SSIDs only.
Chun-Yeow Yeoh brings a mac80211 fix for a mesh problem triggered
when combining multiple mesh networks into one.
Felix Fietkau provides a mac80211 lockdep fix.
Joe Perches fixes a couple of thinkos related to bitwise operations.
Johannes Berg comes through with a flurry of fixes. The iwlwifi ones
address a problem Linus recently reported, and some of the fallout
discovered while fixing it. The mac80211 fix properly cleans-up
remain-on-channel work on an interface that is stopped. The others
are clean-ups for regressions caused by stricter checking of possible
virtual interfaces supported by wireless drivers.
Meenakshi Venkataraman provides a mac80211 fix for an off-by-one error.
Seth Forshee provides a fix to make the wireless adapters used in
some Mac boxes work after being in S3 power saving state.
Stanislaw Gruszka offers a copule of fixes, a fix for a mac80211
scanning regression and an rt2x00 fix to avoid some lockdep spew.
Last but not least, Vinicius Costa Gomes provides a bluetooth fix
for a typo that "was preventing important features of Bluetooth
from working".
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Redesign all the off-channel code, getting rid of
the generic off-channel work concept, replacing
it with a simple remain-on-channel list.
This fixes a number of small issues with the ROC
implementation:
* offloaded remain-on-channel couldn't be queued,
now we can queue it as well, if needed
* in iwlwifi (the only user) offloaded ROC is
mutually exclusive with scanning, use the new
queue to handle that case -- I expect that it
will later depend on a HW flag
The bigger issue though is that there's a bad bug
in the current implementation: if we get a mgmt
TX request while HW roc is active, and this new
request has a wait time, we actually schedule a
software ROC instead since we can't guarantee the
existing offloaded ROC will still be that long.
To fix this, the queuing mechanism was needed.
The queuing mechanism for offloaded ROC isn't yet
optimal, ideally we should add API to have the HW
extend the ROC if needed. We could add that later
but for now use a software implementation.
Overall, this unifies the behaviour between the
offloaded and software-implemented case as much
as possible.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The IDLE handling in HW off-channel is broken right
now since we turn off IDLE only when the off-channel
period already started. Therefore, all drivers that
use it today (only iwlwifi!) must support off-channel
while idle, so playing with idle isn't needed at all.
Off-channel in general, since it's no longer used for
authentication/association, shouldn't affect PS, so
also remove that logic.
Also document a small caveat for reporting TX status
from off-channel frames in HW remain-on-channel.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When performing a HW restart for an AP mode interface, add stations back
only after the AP is beaconing. This mimics the normal flow of STA
addition on AP.
Some devices (wlcore) do not support adding stations before beaconing,
so this has the added benefit of making recovery work for them.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The remain-on-channel time validation shouldn't
depend on the value of HZ, as it does now with
the check against jiffies, since then you might
use a value that works on one system but not on
another. Fix it by checking against a minimum
that's fixed.
Also add validation of the wait duration for a
management frame TX since this also translates
into remain-on-channel internally.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
drv_resume can get called without a prior call to drv_suspend.
Consider the following steps:
1. Suspend is started but driver's drv_suspend returns error.
2. Suspend is aborted. local->wowlan flag is left set.
3. Interface is removed.
4. Suspend again. This time open_count is 0 so drv_suspend is
not called and local->wowlan not cleared.
5. On resume ieee80211_reconfig will call drv_resume since
local->wowlan is set.
Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Make mac80211 print a message when it disables
HT due to the connection using WEP/TKIP or due
to the AP not supporting WMM/QoS.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
For each EDCA TX queue change default settings (in STA mode) to conform
old 802.11b/g channel access rules. This is needed for drivers that do
not have QoS enable/disable "switch" (like rt2x00) to make them work
properly with legacy APs.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
__rfkill_switch_all() switches the state of devices of a given type; however,
it does not switch devices of all type (RFKILL_TYPE_ALL). As a result, it
ignores the keycode "KEY_RFKILL" from another module, i.e. eeepc-wmi.
This fix is to make __rfkill_switch_all() to be able to switch not only
devices of a given type but also all devices.
Signed-off-by: Alex Hung <alex.hung@canonical.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Now that we've removed all uses of the set_channel
API except for the monitor channel and in libertas,
clarify this. Split the libertas mesh use into a
new libertas_set_mesh_channel() operation, just to
keep backward compatibility, and rename the normal
set_channel() to set_monitor_channel().
Also describe the desired set_monitor_channel()
semantics more clearly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
commit 5faa5df1fa (inetpeer: Invalidate the inetpeer tree along with
the routing cache) added a race :
Before freeing an inetpeer, we must respect a RCU grace period, and make
sure no user will attempt to increase refcnt.
inetpeer_invalidate_tree() waits for a RCU grace period before inserting
inetpeer tree into gc_list and waking the worker. At that time, no
concurrent lookup can find a inetpeer in this tree.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ceph_con_revoke_message() is passed both a message and a ceph
connection. A ceph_msg allocated for incoming messages on a
connection always has a pointer to that connection, so there's no
need to provide the connection when revoking such a message.
Note that the existing logic does not preclude the message supplied
being a null/bogus message pointer. The only user of this interface
is the OSD client, and the only value an osd client passes is a
request's r_reply field. That is always non-null (except briefly in
an error path in ceph_osdc_alloc_request(), and that drops the
only reference so the request won't ever have a reply to revoke).
So we can safely assume the passed-in message is non-null, but add a
BUG_ON() to make it very obvious we are imposing this restriction.
Rename the function ceph_msg_revoke_incoming() to reflect that it is
really an operation on an incoming message.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
ceph_con_revoke() is passed both a message and a ceph connection.
Now that any message associated with a connection holds a pointer
to that connection, there's no need to provide the connection when
revoking a message.
This has the added benefit of precluding the possibility of the
providing the wrong connection pointer. If the message's connection
pointer is null, it is not being tracked by any connection, so
revoking it is a no-op. This is supported as a convenience for
upper layers, so they can revoke a message that is not actually
"in flight."
Rename the function ceph_msg_revoke() to reflect that it is really
an operation on a message, not a connection.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
There are essentially two types of ceph messages: incoming and
outgoing. Outgoing messages are always allocated via ceph_msg_new(),
and at the time of their allocation they are not associated with any
particular connection. Incoming messages are always allocated via
ceph_con_in_msg_alloc(), and they are initially associated with the
connection from which incoming data will be placed into the message.
When an outgoing message gets sent, it becomes associated with a
connection and remains that way until the message is successfully
sent. The association of an incoming message goes away at the point
it is sent to an upper layer via a con->ops->dispatch method.
This patch implements reference counting for all ceph messages, such
that every message holds a reference (and a pointer) to a connection
if and only if it is associated with that connection (as described
above).
For background, here is an explanation of the ceph message
lifecycle, emphasizing when an association exists between a message
and a connection.
Outgoing Messages
An outgoing message is "owned" by its allocator, from the time it is
allocated in ceph_msg_new() up to the point it gets queued for
sending in ceph_con_send(). Prior to that point the message's
msg->con pointer is null; at the point it is queued for sending its
message pointer is assigned to refer to the connection. At that
time the message is inserted into a connection's out_queue list.
When a message on the out_queue list has been sent to the socket
layer to be put on the wire, it is transferred out of that list and
into the connection's out_sent list. At that point it is still owned
by the connection, and will remain so until an acknowledgement is
received from the recipient that indicates the message was
successfully transferred. When such an acknowledgement is received
(in process_ack()), the message is removed from its list (in
ceph_msg_remove()), at which point it is no longer associated with
the connection.
So basically, any time a message is on one of a connection's lists,
it is associated with that connection. Reference counting outgoing
messages can thus be done at the points a message is added to the
out_queue (in ceph_con_send()) and the point it is removed from
either its two lists (in ceph_msg_remove())--at which point its
connection pointer becomes null.
Incoming Messages
When an incoming message on a connection is getting read (in
read_partial_message()) and there is no message in con->in_msg,
a new one is allocated using ceph_con_in_msg_alloc(). At that
point the message is associated with the connection. Once that
message has been completely and successfully read, it is passed to
upper layer code using the connection's con->ops->dispatch method.
At that point the association between the message and the connection
no longer exists.
Reference counting of connections for incoming messages can be done
by taking a reference to the connection when the message gets
allocated, and releasing that reference when it gets handed off
using the dispatch method.
We should never fail to get a connection reference for a
message--the since the caller should already hold one.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
When a ceph message is queued for sending it is placed on a list of
pending messages (ceph_connection->out_queue). When they are
actually sent over the wire, they are moved from that list to
another (ceph_connection->out_sent). When acknowledgement for the
message is received, it is removed from the sent messages list.
During that entire time the message is "in the possession" of a
single ceph connection. Keep track of that connection in the
message. This will be used in the next patch (and is a helpful
bit of information for debugging anyway).
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
The function ceph_alloc_msg() is only used to allocate a message
that will be assigned to a connection's in_msg pointer. Rename the
function so this implied usage is more clear.
In addition, make that assignment inside the function (again, since
that's precisely what it's intended to be used for). This allows us
to return what is now provided via the passed-in address of a "skip"
variable. The return type is now Boolean to be explicit that there
are only two possible outcomes.
Make sure the result of an ->alloc_msg method call always sets the
value of *skip properly.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Move the initialization of a ceph connection's private pointer,
operations vector pointer, and peer name information into
ceph_con_init(). Rearrange the arguments so the connection pointer
is first. Hide the byte-swapping of the peer entity number inside
ceph_con_init()
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Hold off initializing a monitor client's connection until just
before it gets opened for use.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
All references to the embedded ceph_connection come from the msgr
workqueue, which is drained prior to mon_client destruction. That
means we can ignore con refcounting entirely.
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Alex Elder <elder@inktank.com>
A monitor client has a pointer to a ceph connection structure in it.
This is the only one of the three ceph client types that do it this
way; the OSD and MDS clients embed the connection into their main
structures. There is always exactly one ceph connection for a
monitor client, so there is no need to allocate it separate from the
monitor client structure.
So switch the ceph_mon_client structure to embed its
ceph_connection structure.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
There were a few direct calls to ceph_con_{get,put}() instead of the con
ops from osd_client.c. This is a bug since those ops aren't defined to
be ceph_con_get/put.
This breaks refcounting on the ceph_osd structs that contain the
ceph_connections, and could lead to all manner of strangeness.
The purpose of the ->get and ->put methods in a ceph connection are
to allow the connection to indicate it has a reference to something
external to the messaging system, *not* to indicate something
external has a reference to the connection.
[elder@inktank.com: added that last sentence]
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Alex Elder <elder@inktank.com>
In ceph_osdc_release_request(), a reference to the r_reply message
is dropped. But just after that, that same message is revoked if it
was in use to receive an incoming reply. Reorder these so we are
sure we hold a reference until we're actually done with the message.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Just like the AP mode patch, instead of setting
the channel and then joining the mesh network,
provide the channel to join the network on to
the join_mesh() function.
Like in AP mode, you can also give the channel
to the join-mesh nl80211 command now.
Unlike AP mode, it picks a default channel if
none was given.
As libertas uses mesh mode interfaces but has
no join_mesh callback and we can't simply break
it, keep some compatibility code for that case
and configure the channel directly for it.
In the non-libertas case, where we store the
channel until join, allow setting it while the
interface is down.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>