Details
Type: Task
Status: Closed
Priority: Major
Resolution: Fixed
Description
While this code has remained dormant for 18 years, libc implementers have used assembly to gain improvements, both by exploiting architecture-specific features and by specializing on buffer length, for example:
https://svnweb.freebsd.org/base/head/lib/libc/amd64/string/memcmp.S
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/aarch64/memcmp.S
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/powerpc/powerpc64/memcpy.S
While the perf results in the commit message don't show a great improvement, the varying lengths tested and the libc code linked above both show that the optimizations depend on buffer length.
Pull request: https://github.com/MariaDB/server/pull/698
Thanks a lot for the testing and results! Very interesting!
I don't want to remove the ptr_compare code yet, as the results depend a lot on:
There is some benefit in knowing the exact length and alignment in advance instead of doing the check on each call. However, your tests show that for our most common platform the libc memcmp is better, and I agree we should use that.
I prefer to keep the old code around, as it's still possible to do a faster memcmp based on the principle of ptr_cmp. I will take your excellent patch comment and disable the ptr_cmp code for now in favor of the standard memcmp.
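For readers following the thread, here is a rough sketch in C of the trade-off discussed above: choosing a comparator once, when the key length is known, versus simply deferring to libc memcmp on every call. This is not the actual ptr_compare code. The names key_cmp_fn, cmp_zero_length, cmp_with_memcmp and pick_key_cmp are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical comparator type: compares the byte strings that *a and *b
   point to, using the length stored in *len. */
typedef int (*key_cmp_fn)(const size_t *len,
                          const unsigned char **a,
                          const unsigned char **b);

/* Zero-length keys always compare equal, so nothing needs to be read. */
static int cmp_zero_length(const size_t *len,
                           const unsigned char **a,
                           const unsigned char **b)
{
  (void) len; (void) a; (void) b;
  return 0;
}

/* General case: let libc memcmp choose its own strategy for the length
   and alignment it sees at call time. */
static int cmp_with_memcmp(const size_t *len,
                           const unsigned char **a,
                           const unsigned char **b)
{
  return memcmp(*a, *b, *len);
}

/* Pick the comparator once, up front, from the known key length.
   A hand-tuned variant could add more length- or alignment-specific
   cases here; this sketch only special-cases length zero. */
key_cmp_fn pick_key_cmp(size_t len)
{
  return len == 0 ? cmp_zero_length : cmp_with_memcmp;
}
```

A caller would invoke pick_key_cmp once per sort and pass the returned function to the sort routine. Whether any extra specialization beats libc memcmp is exactly what the perf results on the pull request measure.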
@montywi