net: skb_shared_info optimization

skb_dma_unmap() is quite expensive for small packets,
because we use two different cache lines from skb_shared_info.

One cache line to access nr_frags, the other to access dma_maps[0].

Instead of making dma_maps an array of MAX_SKB_FRAGS + 1 elements,
store the head mapping on its own in a new dma_head field, close to
nr_frags, to reduce cache line misses.
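
A rough layout sketch of the idea (illustrative only; the struct name
skb_shared_info_sketch and the reduced field set are assumptions, not the
verbatim include/linux/skbuff.h definition):

#include <linux/skbuff.h>	/* skb_frag_t, MAX_SKB_FRAGS */
#include <linux/types.h>	/* dma_addr_t */

/*
 * Sketch: dma_head sits next to nr_frags, so unmapping a linear-only
 * (small) skb reads a single cache line of skb_shared_info, while the
 * per-fragment mappings stay in a trailing MAX_SKB_FRAGS-sized array.
 */
struct skb_shared_info_sketch {
	unsigned short	nr_frags;		/* read on every unmap */
	dma_addr_t	dma_head;		/* mapping of the linear head */
	/* ... gso fields, frag_list, etc. omitted ... */
	skb_frag_t	frags[MAX_SKB_FRAGS];
	dma_addr_t	dma_maps[MAX_SKB_FRAGS];	/* one entry per fragment */
};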

Tested on my dev machine (bnx2 & tg3 adapters): nice speedup!

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 042a53a9e4 (parent eae3f29cc7)
Author: Eric Dumazet, 2009-06-05 04:04:16 +00:00
Committed by: David S. Miller
10 changed files with 30 additions and 29 deletions

@@ -5487,7 +5487,7 @@ bnx2_run_loopback(struct bnx2 *bp, int loopback_mode)
 		dev_kfree_skb(skb);
 		return -EIO;
 	}
-	map = skb_shinfo(skb)->dma_maps[0];
+	map = skb_shinfo(skb)->dma_head;
 
 	REG_WR(bp, BNX2_HC_COMMAND,
 	       bp->hc_cmd | BNX2_HC_COMMAND_COAL_NOW_WO_INT);
@@ -6167,7 +6167,7 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	sp = skb_shinfo(skb);
-	mapping = sp->dma_maps[0];
+	mapping = sp->dma_head;
 
 	tx_buf = &txr->tx_buf_ring[ring_prod];
 	tx_buf->skb = skb;
@@ -6191,7 +6191,7 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		txbd = &txr->tx_desc_ring[ring_prod];
 
 		len = frag->size;
-		mapping = sp->dma_maps[i + 1];
+		mapping = sp->dma_maps[i];
 
 		txbd->tx_bd_haddr_hi = (u64) mapping >> 32;
 		txbd->tx_bd_haddr_lo = (u64) mapping & 0xffffffff;
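
With the head mapping split out of dma_maps[], fragment i now lives at
dma_maps[i] (hence the i + 1 -> i change above), and the core unmap path
only needs dma_head and nr_frags for a small, linear-only packet. A sketch
of that path under the post-patch layout (close to, but not claimed to be
verbatim, the net/core/skb_dma_map.c code of that era; the function name
skb_dma_unmap_sketch is illustrative):

#include <linux/dma-mapping.h>
#include <linux/skbuff.h>

/* Sketch of the unmap path after this change (not verbatim upstream code). */
static void skb_dma_unmap_sketch(struct device *dev, struct sk_buff *skb,
				 enum dma_data_direction dir)
{
	struct skb_shared_info *sp = skb_shinfo(skb);
	int i;

	/* Linear head: nr_frags and dma_head now sit close together,
	 * so a small packet touches one cache line of skb_shared_info.
	 */
	dma_unmap_single(dev, sp->dma_head, skb_headlen(skb), dir);

	/* Paged fragments: dma_maps[i] pairs with frags[i] one-to-one. */
	for (i = 0; i < sp->nr_frags; i++)
		dma_unmap_page(dev, sp->dma_maps[i], sp->frags[i].size, dir);
}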