net: percpu net_device refcount

We tried very hard to remove all possible dev_hold()/dev_put() pairs in
network stack, using RCU conversions.

There is still an unavoidable device refcount change for every dst we
create/destroy, and this can slow down some workloads (routers or some
app servers, mmap af_packet)

We can switch to a percpu refcount implementation, now dynamic per_cpu
infrastructure is mature. On a 64 cpus machine, this consumes 256 bytes
per device.

On x86, dev_hold(dev) code :

before
        lock    incl 0x280(%ebx)
after:
        movl    0x260(%ebx),%eax
        incl    fs:(%eax)

Stress bench :

(Sending 160.000.000 UDP frames,
IP route cache disabled, dual E5540 @2.53GHz,
32bit kernel, FIB_TRIE)

Before:

real    1m1.662s
user    0m14.373s
sys     12m55.960s

After:

real    0m51.179s
user    0m15.329s
sys     10m15.942s

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
Eric Dumazet
2010-10-11 10:22:12 +00:00
committed by David S. Miller
parent f0b9f47251
commit 29b4433d99
4 changed files with 41 additions and 14 deletions

View File

@@ -2701,7 +2701,7 @@ static int nes_disconnect(struct nes_qp *nesqp, int abrupt)
nesibdev = nesvnic->nesibdev;
nes_debug(NES_DBG_CM, "netdev refcnt = %u.\n",
atomic_read(&nesvnic->netdev->refcnt));
netdev_refcnt_read(nesvnic->netdev));
if (nesqp->active_conn) {
@@ -2791,7 +2791,7 @@ int nes_accept(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
atomic_inc(&cm_accepts);
nes_debug(NES_DBG_CM, "netdev refcnt = %u.\n",
atomic_read(&nesvnic->netdev->refcnt));
netdev_refcnt_read(nesvnic->netdev));
/* allocate the ietf frame and space for private data */
nesqp->ietf_frame = pci_alloc_consistent(nesdev->pcidev,

View File

@@ -785,7 +785,7 @@ static struct ib_pd *nes_alloc_pd(struct ib_device *ibdev,
nes_debug(NES_DBG_PD, "nesvnic=%p, netdev=%p %s, ibdev=%p, context=%p, netdev refcnt=%u\n",
nesvnic, nesdev->netdev[0], nesdev->netdev[0]->name, ibdev, context,
atomic_read(&nesvnic->netdev->refcnt));
netdev_refcnt_read(nesvnic->netdev));
err = nes_alloc_resource(nesadapter, nesadapter->allocated_pds,
nesadapter->max_pd, &pd_num, &nesadapter->next_pd);
@@ -1416,7 +1416,7 @@ static struct ib_qp *nes_create_qp(struct ib_pd *ibpd,
/* update the QP table */
nesdev->nesadapter->qp_table[nesqp->hwqp.qp_id-NES_FIRST_QPN] = nesqp;
nes_debug(NES_DBG_QP, "netdev refcnt=%u\n",
atomic_read(&nesvnic->netdev->refcnt));
netdev_refcnt_read(nesvnic->netdev));
return &nesqp->ibqp;
}