tile: optimize and clean up string functions
This change cleans up the string code in a number of ways:

- For memcpy(), fix bug in prefetch and increase distance to 3 lines;
  optimize for unaligned data; do all loads before wh64 to make memcpy
  safe for forward-overlapping calls; etc.  Performance is improved.

- Use new copy_byte() function on tilegx to spread a single byte value
  out into a full word using the shufflebytes instruction.

- Clean up header include ordering to be more canonical, and remove
  spurious #undefs of function names.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
@@ -1,5 +1,5 @@
 /*
- * Copyright 2011 Tilera Corporation. All Rights Reserved.
+ * Copyright 2013 Tilera Corporation. All Rights Reserved.
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
@@ -31,3 +31,14 @@
 #define CFZ(x) __insn_clz(x)
 #define REVCZ(x) __insn_ctz(x)
 #endif
+
+/*
+ * Create eight copies of the byte in a uint64_t. Byte Shuffle uses
+ * the bytes of srcB as the index into the dest vector to select a
+ * byte. With all indices of zero, the first byte is copied into all
+ * the other bytes.
+ */
+static inline uint64_t copy_byte(uint8_t byte)
+{
+	return __insn_shufflebytes(byte, 0, 0);
+}
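The copy_byte() helper above depends on the Tile-Gx shufflebytes intrinsic, so it cannot be built or tried on other targets. As a rough illustration of the result it produces (the same byte repeated in all eight byte positions of a 64-bit word), here is a minimal portable sketch. It is not part of this commit: the name copy_byte_portable is hypothetical, and it uses a multiply in place of the shufflebytes instruction.

```c
#include <stdint.h>

/*
 * Hypothetical portable stand-in for copy_byte(); not part of the
 * commit.  Multiplying by 0x0101010101010101 places the byte value
 * in each of the eight byte lanes of the result.  No carries occur
 * between lanes because the input is at most 0xff.
 */
static inline uint64_t copy_byte_portable(uint8_t byte)
{
	return (uint64_t)byte * 0x0101010101010101ULL;
}
```

For example, copy_byte_portable(0xab) yields 0xabababababababab, the kind of fill word a word-at-a-time memset() or memchr() loop would use.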