crypto: powerpc - Add POWER8 optimised crc32c

Use the vector polynomial multiply-sum instructions in POWER8 to
speed up crc32c.

This is just over 41x faster than the slice-by-8 method that it
replaces. Measurements on a 4.1 GHz POWER8 show it sustaining
52 GiB/sec.

A simple btrfs write performance test:

    dd if=/dev/zero of=/mnt/tmpfile bs=1M count=4096
    sync

is over 3.7x faster.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This commit is contained in:
Anton Blanchard
2016-07-01 08:19:45 +10:00
committed by Herbert Xu
parent 151f25112f
commit 6dd7a82cc5
5 changed files with 1745 additions and 0 deletions

View File

@@ -437,6 +437,17 @@ config CRYPTO_CRC32C_INTEL
gain performance compared with software implementation.
Module will be crc32c-intel.
config CRYPT_CRC32C_VPMSUM
tristate "CRC32c CRC algorithm (powerpc64)"
depends on PPC64
select CRYPTO_HASH
select CRC32
help
CRC32c algorithm implemented using vector polynomial multiply-sum
(vpmsum) instructions, introduced in POWER8. Enable on POWER8
and newer processors for improved performance.
config CRYPTO_CRC32C_SPARC64
tristate "CRC32c CRC algorithm (SPARC64)"
depends on SPARC64