[MDEV-21816] Suboptimal implementation of my_convert() for ARM64 Created: 2020-02-25  Updated: 2020-10-02

Status: Open
Project: MariaDB Server
Component/s: Character Sets
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Alexey Kopytov
Assignee: Unassigned
Resolution: Unresolved
Votes: 0
Labels: ARMv8, performance
Environment: ARM64



 Description   

my_convert() in strings/ctype.c has this special optimization for i386 and x86_64:

#if defined(__i386__) || defined(__x86_64__)
  /*
    Special loop for i386, it allows to refer to a
    non-aligned memory block as UINT32, which makes
    it possible to copy four bytes at once. This
    gives about 10% performance improvement comparing
    to byte-by-byte loop.
  */
  for ( ; length >= 4; length-= 4, from+= 4, to+= 4)
  {
    if ((*(uint32*)from) & 0x80808080)
      break;
    *((uint32*) to)= *((const uint32*) from);
  }
#endif /* __i386__ */
 
... /* Unoptimized bytewise processing goes here */

Two things can be improved about that code:

1. 64-bit architectures like x86_64 could be optimized even further by processing 8 bytes at a time;
2. Other 64-bit architectures like aarch64 could also benefit from the same optimization, rather than processing the input byte by byte.

In our case we see a few percent improvement in CPU-bound sysbench OLTP RO on AArch64, which is not too bad for such a simple optimization.
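
For illustration, the two improvements listed above could be combined roughly as follows. This is only a sketch, not the actual patch: the helper name copy_ascii_prefix8() and its return convention are made up for this example and do not exist in strings/ctype.c. It uses memcpy() for the 8-byte loads and stores instead of pointer casts, so it does not depend on the target allowing unaligned access; compilers typically turn a fixed-size memcpy() like this into a single load or store on x86_64 and AArch64.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
  Hypothetical helper (not part of MariaDB): copy the leading pure-ASCII
  part of 'from' into 'to', 8 bytes at a time. Returns the number of
  bytes copied; the caller is expected to continue with the ordinary
  bytewise loop from that offset.
*/
static size_t copy_ascii_prefix8(char *to, const char *from, size_t length)
{
  size_t done= 0;

  for ( ; length - done >= 8; done+= 8)
  {
    uint64_t chunk;
    memcpy(&chunk, from + done, sizeof(chunk));  /* alignment-safe load */
    if (chunk & UINT64_C(0x8080808080808080))
      break;                       /* some byte in this chunk has bit 7 set */
    memcpy(to + done, &chunk, sizeof(chunk));    /* alignment-safe store */
  }
  return done;
}
```

Because the loop has no architecture check, it would apply to aarch64 and x86_64 alike. The tail (fewer than 8 bytes, or the chunk containing the first non-ASCII byte) is left to the existing bytewise code.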



 Comments   
Comment by Alexey Kopytov [ 2020-02-25 ]

It's worth mentioning that improvements in benchmark numbers are seen with default-character-set=utf8.

Comment by Daniel Black [ 2020-10-02 ]

hmm. pretty sure everything allows unaligned access now. I remember it being a problem on ppc64le too.

I'm thinking a roll to 8 byte, maybe a preloop to get it aligned because afaik aligned access is still a little bit faster and on a string this could be biggish. Let the compiler handle the 32-bit case.
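
A rough sketch of what such a pre-loop could look like, under a few assumptions: the helper name copy_ascii_prefix_aligned() is hypothetical, only the source pointer is aligned (source and destination generally cannot both be aligned at once), and memcpy() is still used for the 8-byte accesses so the destination may stay unaligned. Whether the alignment step actually pays off would need to be measured.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
  Hypothetical helper (not part of MariaDB): same contract as a plain
  8-byte ASCII copy loop, but with a bytewise pre-loop that advances
  until the source address is 8-byte aligned. Returns the number of
  bytes copied before the first non-ASCII byte or the short tail.
*/
static size_t copy_ascii_prefix_aligned(char *to, const char *from,
                                        size_t length)
{
  size_t done= 0;

  /* Pre-loop: at most 7 single-byte copies to reach an aligned source. */
  while (done < length && ((uintptr_t) (from + done) & 7) != 0)
  {
    if ((unsigned char) from[done] & 0x80)
      return done;                 /* non-ASCII byte, stop here */
    to[done]= from[done];
    done++;
  }

  /* Main loop: aligned 8-byte loads, possibly unaligned 8-byte stores. */
  for ( ; length - done >= 8; done+= 8)
  {
    uint64_t chunk;
    memcpy(&chunk, from + done, sizeof(chunk));
    if (chunk & UINT64_C(0x8080808080808080))
      break;
    memcpy(to + done, &chunk, sizeof(chunk));
  }
  return done;
}
```

On a 32-bit target the compiler is free to split each 8-byte memcpy() into two 4-byte accesses, which matches the idea of leaving the 32-bit case to the compiler.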

FYI krunalbauskar
