block, bfq: move debug blkio stats behind CONFIG_DEBUG_BLK_CGROUP
BFQ currently creates, and updates, its own instance of the whole set of blkio statistics that cfq creates. Yet, from the comments of Tejun Heo in [1], it turned out that most of these statistics are meant/useful only for debugging. This commit makes BFQ create the latter, debugging statistics only if the option CONFIG_DEBUG_BLK_CGROUP is set. By doing so, this commit also enables BFQ to enjoy a high perfomance boost. The reason is that, if CONFIG_DEBUG_BLK_CGROUP is not set, then BFQ has to update far fewer statistics, and, in particular, not the heaviest to update. To give an idea of the benefits, if CONFIG_DEBUG_BLK_CGROUP is not set, then, on an Intel i7-4850HQ, and with 8 threads doing random I/O in parallel on null_blk (configured with 0 latency), the throughput of BFQ grows from 310 to 400 KIOPS (+30%). We have measured similar or even much higher boosts with other CPUs: e.g., +45% with an ARM CortexTM-A53 Octa-core. Our results have been obtained and can be reproduced very easily with the script in [1]. [1] https://www.spinics.net/lists/linux-block/msg18943.html Suggested-by: Tejun Heo <tj@kernel.org> Suggested-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Luca Miccio <lucmiccio@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
This commit is contained in:
@@ -20,12 +20,22 @@ for that device, by setting low_latency to 0. See Section 3 for
|
||||
details on how to configure BFQ for the desired tradeoff between
|
||||
latency and throughput, or on how to maximize throughput.
|
||||
|
||||
BFQ has a non-null overhead, which limits the maximum IOPS that the
|
||||
CPU can process for a device scheduled with BFQ. To give an idea of
|
||||
the limits on slow or average CPUs, here are BFQ limits for three
|
||||
different CPUs, on, respectively, an average laptop, an old desktop,
|
||||
and a cheap embedded system, in case full hierarchical support is
|
||||
enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set):
|
||||
BFQ has a non-null overhead, which limits the maximum IOPS that a CPU
|
||||
can process for a device scheduled with BFQ. To give an idea of the
|
||||
limits on slow or average CPUs, here are, first, the limits of BFQ for
|
||||
three different CPUs, on, respectively, an average laptop, an old
|
||||
desktop, and a cheap embedded system, in case full hierarchical
|
||||
support is enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set), but
|
||||
CONFIG_DEBUG_BLK_CGROUP is not set (Section 4-2):
|
||||
- Intel i7-4850HQ: 400 KIOPS
|
||||
- AMD A8-3850: 250 KIOPS
|
||||
- ARM CortexTM-A53 Octa-core: 80 KIOPS
|
||||
|
||||
If CONFIG_DEBUG_BLK_CGROUP is set (and of course full hierarchical
|
||||
support is enabled), then the sustainable throughput with BFQ
|
||||
decreases, because all blkio.bfq* statistics are created and updated
|
||||
(Section 4-2). For BFQ, this leads to the following maximum
|
||||
sustainable throughputs, on the same systems as above:
|
||||
- Intel i7-4850HQ: 310 KIOPS
|
||||
- AMD A8-3850: 200 KIOPS
|
||||
- ARM CortexTM-A53 Octa-core: 56 KIOPS
|
||||
@@ -505,6 +515,22 @@ BFQ-specific files is "blkio.bfq." or "io.bfq." For example, the group
|
||||
parameter to set the weight of a group with BFQ is blkio.bfq.weight
|
||||
or io.bfq.weight.
|
||||
|
||||
As for cgroups-v1 (blkio controller), the exact set of stat files
|
||||
created, and kept up-to-date by bfq, depends on whether
|
||||
CONFIG_DEBUG_BLK_CGROUP is set. If it is set, then bfq creates all
|
||||
the stat files documented in
|
||||
Documentation/cgroup-v1/blkio-controller.txt. If, instead,
|
||||
CONFIG_DEBUG_BLK_CGROUP is not set, then bfq creates only the files
|
||||
blkio.bfq.io_service_bytes
|
||||
blkio.bfq.io_service_bytes_recursive
|
||||
blkio.bfq.io_serviced
|
||||
blkio.bfq.io_serviced_recursive
|
||||
|
||||
The value of CONFIG_DEBUG_BLK_CGROUP greatly influences the maximum
|
||||
throughput sustainable with bfq, because updating the blkio.bfq.*
|
||||
stats is rather costly, especially for some of the stats enabled by
|
||||
CONFIG_DEBUG_BLK_CGROUP.
|
||||
|
||||
Parameters to set
|
||||
-----------------
|
||||
|
||||
|
Reference in New Issue
Block a user