tile: optimize and clean up string functions

This change cleans up the string code in a number of ways:

- For memcpy(), fix bug in prefetch and increase distance to 3 lines;
  optimize for unaligned data; do all loads before wh64 to make memcpy
  safe for forward-overlapping calls; etc.  Performance is improved.

- Use new copy_byte() function on tilegx to spread a single byte value
  out into a full word using the shufflebytes instruction.

- Clean up header include ordering to be more canonical, and remove
  spurious #undefs of function names.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This commit is contained in:
Chris Metcalf
2013-08-01 15:52:17 -04:00
parent dd78bc11fb
commit c53c70a90f
8 changed files with 209 additions and 81 deletions

View File

@@ -36,7 +36,7 @@ void *memchr(const void *s, int c, size_t n)
p = (const uint64_t *)(s_int & -8);
/* Create eight copies of the byte for which we are looking. */
goal = 0x0101010101010101ULL * (uint8_t) c;
goal = copy_byte(c);
/* Read the first word, but munge it so that bytes before the array
* will not match goal.