inetpeer: get rid of ip_id_count

Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
Eric Dumazet
2014-06-02 05:26:03 -07:00
committed by David S. Miller
parent e067ee336a
commit 73f156a6e8
17 changed files with 66 additions and 156 deletions

View File

@@ -41,14 +41,13 @@ struct inet_peer {
struct rcu_head gc_rcu;
};
/*
* Once inet_peer is queued for deletion (refcnt == -1), following fields
* are not available: rid, ip_id_count
* Once inet_peer is queued for deletion (refcnt == -1), following field
* is not available: rid
* We can share memory with rcu_head to help keep inet_peer small.
*/
union {
struct {
atomic_t rid; /* Frag reception counter */
atomic_t ip_id_count; /* IP ID for the next packet */
};
struct rcu_head rcu;
struct inet_peer *gc_next;
@@ -165,7 +164,7 @@ bool inet_peer_xrlim_allow(struct inet_peer *peer, int timeout);
void inetpeer_invalidate_tree(struct inet_peer_base *);
/*
* temporary check to make sure we dont access rid, ip_id_count, tcp_ts,
* temporary check to make sure we dont access rid, tcp_ts,
* tcp_ts_stamp if no refcount is taken on inet_peer
*/
static inline void inet_peer_refcheck(const struct inet_peer *p)
@@ -173,20 +172,4 @@ static inline void inet_peer_refcheck(const struct inet_peer *p)
WARN_ON_ONCE(atomic_read(&p->refcnt) <= 0);
}
/* can be called with or without local BH being disabled */
static inline int inet_getid(struct inet_peer *p, int more)
{
int old, new;
more++;
inet_peer_refcheck(p);
do {
old = atomic_read(&p->ip_id_count);
new = old + more;
if (!new)
new = 1;
} while (atomic_cmpxchg(&p->ip_id_count, old, new) != old);
return new;
}
#endif /* _NET_INETPEER_H */