tcp: allow for bigger reordering level

While testing upcoming Yaogong patch (converting out of order queue
into an RB tree), I hit the max reordering level of linux TCP stack.

Reordering level was limited to 127 for no good reason, and some
network setups [1] can easily reach this limit and get limited
throughput.

Allow a new max limit of 300, and add a sysctl to allow admins to even
allow bigger (or lower) values if needed.

[1] Aggregation of links, per packet load balancing, fabrics not doing
 deep packet inspections, alternative TCP congestion modules...

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yaogong Wang <wygivan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
Eric Dumazet
2014-10-27 21:45:24 -07:00
committed by David S. Miller
parent 7aef06db0f
commit dca145ffaa
6 changed files with 23 additions and 12 deletions

View File

@@ -2230,11 +2230,8 @@ balance-rr: This mode is the only mode that will permit a single
It is possible to adjust TCP/IP's congestion limits by
altering the net.ipv4.tcp_reordering sysctl parameter. The
usual default value is 3, and the maximum useful value is 127.
For a four interface balance-rr bond, expect that a single
TCP/IP stream will utilize no more than approximately 2.3
interface's worth of throughput, even after adjusting
tcp_reordering.
usual default value is 3. But keep in mind TCP stack is able
to automatically increase this when it detects reorders.
Note that the fraction of packets that will be delivered out of
order is highly variable, and is unlikely to be zero. The level

View File

@@ -376,9 +376,17 @@ tcp_orphan_retries - INTEGER
may consume significant resources. Cf. tcp_max_orphans.
tcp_reordering - INTEGER
Maximal reordering of packets in a TCP stream.
Initial reordering level of packets in a TCP stream.
TCP stack can then dynamically adjust flow reordering level
between this initial value and tcp_max_reordering
Default: 3
tcp_max_reordering - INTEGER
Maximal reordering level of packets in a TCP stream.
300 is a fairly conservative value, but you might increase it
if paths are using per packet load balancing (like bonding rr mode)
Default: 300
tcp_retrans_collapse - BOOLEAN
Bug-to-bug compatibility with some broken printers.
On retransmit try to send bigger packets to work around bugs in