inetpeer: get rid of ip_id_count
Ideally, we would need to generate IP ID using a per destination IP generator.

Linux kernels used the inet_peer cache for this purpose, but this had a huge cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs, with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth is about 20.

4) If server deals with many TCP flows, we have a high probability of not finding the inet_peer, allocating a fresh one, inserting it in the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation does not have to be 'perfect'.

Goal is trying to avoid duplicates in a short period of time, so that reassembly units have a chance to complete reassembly of fragments belonging to one message before receiving other fragments with a recycled ID.

We simply use an array of generators, and a Jenkins hash using the dst IP as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it belongs (it is only used from this file).

secure_ip_id() and secure_ipv6_id() are no longer needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
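The "array of generators" idea from the message can be sketched in user-space C as follows. This is only an illustration under stated assumptions, not the kernel code from this commit: the table size `ID_GENERATORS`, the `seed` parameter, and the function names `hash_daddr()` and `select_ident_segs()` are hypothetical, and the hash shown is Bob Jenkins' one-at-a-time hash, which may differ from the Jenkins variant the kernel uses.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical table size; a power of two so the hash can be masked. */
#define ID_GENERATORS 2048u

/* One shared counter per bucket instead of one counter per peer. */
static _Atomic uint32_t id_generators[ID_GENERATORS];

/* Jenkins one-at-a-time hash over the four bytes of an IPv4 address,
 * started from a secret seed. Illustrative stand-in for the kernel hash. */
static uint32_t hash_daddr(uint32_t daddr, uint32_t seed)
{
    uint32_t h = seed;

    for (int i = 0; i < 4; i++) {
        h += (daddr >> (8 * i)) & 0xffu;
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Reserve `segs` consecutive IDs for packets sent to `daddr` and
 * return the first one, truncated to the 16-bit IP ID field. */
static uint16_t select_ident_segs(uint32_t daddr, uint32_t seed, uint32_t segs)
{
    uint32_t bucket = hash_daddr(daddr, seed) & (ID_GENERATORS - 1u);
    uint32_t first = atomic_fetch_add(&id_generators[bucket], segs);

    return (uint16_t)first;
}
```

Passing the segment count directly lets a caller reserve a whole run of IDs with one atomic add. Two destinations that hash to the same bucket share a counter, which is acceptable because the stated goal is only to avoid duplicates within a short period of time.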
committed by
David S. Miller

parent
e067ee336a
commit
73f156a6e8
@@ -26,20 +26,7 @@
  * Theory of operations.
  * We keep one entry for each peer IP address. The nodes contains long-living
  * information about the peer which doesn't depend on routes.
- * At this moment this information consists only of ID field for the next
- * outgoing IP packet. This field is incremented with each packet as encoded
- * in inet_getid() function (include/net/inetpeer.h).
- * At the moment of writing this notes identifier of IP packets is generated
- * to be unpredictable using this code only for packets subjected
- * (actually or potentially) to defragmentation. I.e. DF packets less than
- * PMTU in size when local fragmentation is disabled use a constant ID and do
- * not use this code (see ip_select_ident() in include/net/ip.h).
  *
- * Route cache entries hold references to our nodes.
- * New cache entries get references via lookup by destination IP address in
- * the avl tree. The reference is grabbed only when it's needed i.e. only
- * when we try to output IP packet which needs an unpredictable ID (see
- * __ip_select_ident() in net/ipv4/route.c).
  * Nodes are removed only when reference counter goes to 0.
  * When it's happened the node may be removed when a sufficient amount of
  * time has been passed since its last use. The less-recently-used entry can
@@ -62,7 +49,6 @@
  * refcnt: atomically against modifications on other CPU;
  * usually under some other lock to prevent node disappearing
  * daddr: unchangeable
- * ip_id_count: atomic value (no lock needed)
  */
 
 static struct kmem_cache *peer_cachep __read_mostly;
@@ -497,10 +483,6 @@ relookup:
 		p->daddr = *daddr;
 		atomic_set(&p->refcnt, 1);
 		atomic_set(&p->rid, 0);
-		atomic_set(&p->ip_id_count,
-				(daddr->family == AF_INET) ?
-					secure_ip_id(daddr->addr.a4) :
-					secure_ipv6_id(daddr->addr.a6));
 		p->metrics[RTAX_LOCK-1] = INETPEER_METRICS_NEW;
 		p->rate_tokens = 0;
 		/* 60*HZ is arbitrary, but chosen enough high so that the first