MariaDB Server, MDEV-26572

Improve simple multibyte collation performance on the ASCII range

Details

    Description

      Binary collations to be improved

      The following binary multi-byte collations (together with their _nopad_bin counterparts):

      • big5_bin
      • cp932_bin
      • eucjpms_bin
      • euckr_bin
      • gb2312_bin
      • gbk_bin
      • sjis_bin
      • ujis_bin
      • utf8mb3_bin
      • utf8mb4_bin

      can improve their performance if, in the following code in strcoll.ic:

      static int
      MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                    const uchar *a, size_t a_length, 
                                    const uchar *b, size_t b_length)
      {
        const uchar *a_end= a + a_length;
        const uchar *b_end= b + b_length;
        for ( ; ; )
        {
          int a_weight, b_weight, res;
          uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
          ...
      

      we catch pure ASCII data and handle 4 or even 8 bytes in one iteration by loading the string data into big-endian uint32 or uint64 numbers and then comparing these two numbers.
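The idea can be sketched as follows. This is only an illustration, not the actual patch: the helper names load_be64, is_ascii8 and ascii_prefix_cmp8 are hypothetical and do not exist in strcoll.ic, and the sketch assumes the caller continues with the existing per-character loop from the returned offset (which also keeps handling trailing spaces and multi-byte characters).

```c
#include <stddef.h>
#include <stdint.h>

/* Load 8 bytes as a big-endian number, so numeric order equals byte order. */
static uint64_t load_be64(const unsigned char *s)
{
  uint64_t v= 0;
  int i;
  for (i= 0; i < 8; i++)
    v= (v << 8) | s[i];
  return v;
}

/* True if none of the 8 bytes has its high bit set, i.e. all are ASCII. */
static int is_ascii8(uint64_t v)
{
  return (v & 0x8080808080808080ULL) == 0;
}

/*
  Hypothetical helper: compare the leading pure ASCII part of a and b
  in 8-byte chunks. Returns <0 or >0 if a difference was found, 0 otherwise.
  *consumed receives the number of leading bytes found equal, so the
  caller can continue with the old code from that offset.
*/
static int ascii_prefix_cmp8(const unsigned char *a, size_t a_length,
                             const unsigned char *b, size_t b_length,
                             size_t *consumed)
{
  size_t pos= 0;
  while (a_length - pos >= 8 && b_length - pos >= 8)
  {
    uint64_t an= load_be64(a + pos);
    uint64_t bn= load_be64(b + pos);
    if (!is_ascii8(an | bn))
      break;                      /* multi-byte data: go to the old code */
    if (an != bn)
    {
      *consumed= pos;
      return an < bn ? -1 : 1;
    }
    pos+= 8;
  }
  *consumed= pos;
  return 0;
}
```

The big-endian load matters: with the first byte in the most significant position, comparing the two numbers gives the same result as comparing the bytes left to right.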

      Case insensitive collations to be improved

      Additionally, the following case insensitive multibyte collations (and their _nopad_ci counterparts):

      • utf8mb3_general_ci
      • utf8mb3_general_mysql500_ci
      • utf8mb4_general_ci
      • cp932_japanese_ci
      • eucjpms_japanese_ci
      • euckr_korean_ci
      • sjis_japanese_ci
      • ujis_japanese_ci

      can use the same idea, because for ASCII they perform only a trivial mapping from lower case Latin letters [a-z] to their upper case counterparts [A-Z], and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:

      • Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
      • Load the two strings into two uint32 or uint64 numbers
      • Perform bulk conversion of all bytes in the two numbers from [61..7A] to [41..5A] (i.e. from [a-z] to [A-Z])
      • Compare the numbers and return if they are different
      • Advance both pointers by 4 or 8 bytes and continue the loop

      Note that the exact method of bulk conversion of the numbers to upper case is left to the developer to work out.
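One possible way to do the bulk conversion is a branch-free bit trick on the whole 64-bit number. The sketch below is an assumption about how this could look, not the method chosen for the task: the names load_be64, ascii_toupper8 and ci_step8 are hypothetical, and the trick is only valid after the pure ASCII test has passed (every byte below 0x80).

```c
#include <stddef.h>
#include <stdint.h>

#define HIGH_BITS 0x8080808080808080ULL

/* Load 8 bytes as a big-endian number, so numeric order equals byte order. */
static uint64_t load_be64(const unsigned char *s)
{
  uint64_t v= 0;
  int i;
  for (i= 0; i < 8; i++)
    v= (v << 8) | s[i];
  return v;
}

/*
  Convert all bytes in [61..7A] to [41..5A] at once.
  Precondition: every byte of v is below 0x80, so the additions
  below cannot carry from one byte into the next.
*/
static uint64_t ascii_toupper8(uint64_t v)
{
  uint64_t ge_a= v + 0x1F1F1F1F1F1F1F1FULL; /* bit 7 set where byte >= 0x61 */
  uint64_t gt_z= v + 0x0505050505050505ULL; /* bit 7 set where byte >= 0x7B */
  uint64_t is_lower= ge_a & ~gt_z & HIGH_BITS;
  return v - (is_lower >> 2);               /* 0x80 >> 2 == 0x20 */
}

/*
  Hypothetical single iteration step. The caller must guarantee that at
  least 8 bytes are available in both a and b.
  Returns 0 if non-ASCII data was found (go to the old code),
  otherwise 1 with *res set to -1, 0 or 1 for this 8-byte chunk.
*/
static int ci_step8(const unsigned char *a, const unsigned char *b, int *res)
{
  uint64_t an= load_be64(a);
  uint64_t bn= load_be64(b);
  if ((an | bn) & HIGH_BITS)
    return 0;
  an= ascii_toupper8(an);
  bn= ascii_toupper8(bn);
  *res= an < bn ? -1 : an > bn ? 1 : 0;
  return 1;
}
```

When ci_step8 returns 1 with *res equal to 0, the loop advances both pointers by 8 and repeats. Any other outcome either ends the comparison or hands control back to the existing per-character code.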

      Requirements

      The performance of the low level comparison functions can be measured with the BENCHMARK() SQL function, e.g.:

      SET NAMES utf8mb3 COLLATE utf8mb3_general_ci;
      SELECT BENCHMARK(10000000,'aaaaaaaaaaaaaaaa'='aaaaaaaaaaaaaaaa');
      

      The expected performance improvement on the pure ASCII data (for strings with octet length >= 4) is between 2 and 3 times (depending on the exact length and collation).

      Note that the changes must be done in a way that does not bring any serious (more than 10%) slowdown for:

      • strings with multi-byte characters
      • short strings 1..3 bytes long

      Collations that won't be changed in this task

      8bit case insensitive collations

      MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code and have their own implementations.

      Three Chinese case insensitive collations

      Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):

      • big5_chinese_ci
      • gb2312_chinese_ci
      • gbk_chinese_ci

      because all these three collations additionally change the order of some ASCII punctuation characters:

      Weight  Character name               Character
      0x5B    U+005D RIGHT SQUARE BRACKET  ]
      0x5C    U+005B LEFT SQUARE BRACKET   [
      0x5D    U+005C REVERSE SOLIDUS       \

      So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later under terms of a separate task.
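To make the difference concrete, here is a per-byte sketch of the ASCII weights implied by the table above. It is an illustration only: the function name chinese_ci_ascii_weight is hypothetical, and the sketch assumes that all other ASCII bytes behave as in the collations described earlier (lower case letters map to upper case, everything else keeps its byte value).

```c
#include <stdint.h>

/*
  Hypothetical per-byte ASCII weight for big5_chinese_ci,
  gb2312_chinese_ci and gbk_chinese_ci, based only on the table above.
*/
static unsigned char chinese_ci_ascii_weight(unsigned char c)
{
  if (c >= 'a' && c <= 'z')
    return (unsigned char) (c - 0x20);  /* [a-z] -> [A-Z] */
  switch (c)
  {
  case ']':  return 0x5B;               /* U+005D sorts first  */
  case '[':  return 0x5C;               /* U+005B sorts second */
  case '\\': return 0x5D;               /* U+005C sorts third  */
  }
  return c;                             /* assumption: unchanged otherwise */
}
```

A bulk conversion for these collations would have to rotate three byte values in addition to the [a-z] range, which is why a plain upper case trick on the whole number is not enough here.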

      Case insensitive _general_ci collations for ucs2, utf16, utf32

      These character sets have separate implementations and don't use the mentioned code. They'll be improved under terms of a separate task.

          Activity

            bar Alexander Barkov created issue -
            bar Alexander Barkov made changes -
            Field Original Value New Value
            Description h2. Binary collations to be improved

            The following binary collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counteparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Increment pointers to 4 or 8 and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The changes must be done in a way not to bring any serios slow down for multi-byte data!


            h2. Collations that won't be changed in this task

            Note, under terms of this task we won't change the following multibyte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So on the bulk conversion step they need more efforts and the proposed optimization may not be efficient. These collations will be improved later under terms of a separate task.

            h2. Binary collations to be improved

            The following binary multibyte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counteparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Increment pointers to 4 or 8 and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The changes must be done in a way not to bring any serios slow down for multi-byte data!


            h2. Collations that won't be changed in this task

            Note, under terms of this task we won't change the following multibyte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So on the bulk conversion step they need more efforts and the proposed optimization may not be efficient. These collations will be improved later under terms of a separate task.

            bar Alexander Barkov made changes -
            Description h2. Binary collations to be improved

            The following binary multibyte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counteparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Increment pointers to 4 or 8 and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The changes must be done in a way not to bring any serios slow down for multi-byte data!


            h2. Collations that won't be changed in this task

            Note, under terms of this task we won't change the following multibyte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So on the bulk conversion step they need more efforts and the proposed optimization may not be efficient. These collations will be improved later under terms of a separate task.

            h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counteparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Increment pointers to 4 or 8 and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The changes must be done in a way not to bring any serios slow down for multi-byte data!


            h2. Collations that won't be changed in this task

            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimizes in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code.

            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So on the bulk conversion step they need more efforts and the proposed optimization may not be efficient. These collations will be improved later under terms of a separate task.

            bar Alexander Barkov made changes -
            Description h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counteparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Increment pointers to 4 or 8 and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The changes must be done in a way not to bring any serios slow down for multi-byte data!


            h2. Collations that won't be changed in this task

            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimizes in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code.

            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So on the bulk conversion step they need more efforts and the proposed optimization may not be efficient. These collations will be improved later under terms of a separate task.

            h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counteparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Increment pointers to 4 or 8 and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The changes must be done in a way not to bring any serios slow down for multi-byte data!


            h2. Collations that won't be changed in this task

            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code.

            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So on the bulk conversion step they need more efforts and the proposed optimization may not be efficient. These collations will be improved later under terms of a separate task.

            bar Alexander Barkov made changes -
            Description h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counteparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Increment pointers to 4 or 8 and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The changes must be done in a way not to bring any serios slow down for multi-byte data!


            h2. Collations that won't be changed in this task

            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code.

            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So on the bulk conversion step they need more efforts and the proposed optimization may not be efficient. These collations will be improved later under terms of a separate task.

            h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counteparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Advance the pointers by 4 or 8 bytes and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.
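
One possible way to do the bulk conversion is a branch-free "SIMD within a register" trick. The sketch below is an assumption about how it could be done, not the method chosen for this task, and the names {{ascii_toupper4}} and {{ascii_ci_compare4}} are hypothetical. It relies on the earlier step having already verified that every byte is below 0x80, so the per-byte additions cannot carry into the neighbouring byte.

```c
#include <stdint.h>

/*
  Hypothetical helper: convert every byte in [0x61..0x7A] to [0x41..0x5A].
  Precondition: all four bytes of w are pure ASCII (< 0x80).
*/
static uint32_t ascii_toupper4(uint32_t w)
{
  uint32_t ge_a= w + 0x1F1F1F1FU;  /* high bit set in bytes >= 0x61 ('a') */
  uint32_t gt_z= w + 0x05050505U;  /* high bit set in bytes >= 0x7B ('z'+1) */
  uint32_t is_lower= ge_a & ~gt_z & 0x80808080U;
  return w - (is_lower >> 2);      /* 0x80 >> 2 == 0x20 in each lower case byte */
}

/*
  Hypothetical helper: compare two big-endian 4-byte ASCII chunks
  ignoring the case of Latin letters. Returns -1, 0 or 1.
*/
static int ascii_ci_compare4(uint32_t a, uint32_t b)
{
  a= ascii_toupper4(a);
  b= ascii_toupper4(b);
  return a < b ? -1 : (a > b);
}
```

For example, the chunk {{0x61626364}} ("abcd") becomes {{0x41424344}} ("ABCD"), while bytes outside {{[61..7A]}}, such as 0x60 or 0x7B, are left unchanged. The 8-byte variant would use the same constants widened to 64 bits.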

            h2. Requirements

            The changes must not cause any serious slowdown for multi-byte data!


            h2. Collations that won't be changed in this task

            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code and have their own implementations.

            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later under a separate task.
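
To make the difference concrete, here is a per-byte sketch of the ASCII weights implied by the table above. It assumes that, apart from the three listed punctuation characters, these collations use the same trivial upper case mapping on the ASCII range; the function name {{chinese_ci_ascii_weight}} is hypothetical and this is not the server's weight table.

```c
/*
  Hypothetical per-byte ASCII weight for big5_chinese_ci,
  gb2312_chinese_ci and gbk_chinese_ci, following the table above.
  Assumption: all other ASCII bytes get the plain toupper weight.
*/
static unsigned char chinese_ci_ascii_weight(unsigned char c)
{
  if (c >= 'a' && c <= 'z')
    return (unsigned char) (c - 0x20);  /* [a-z] -> [A-Z] */
  switch (c)
  {
  case ']':  return 0x5B;  /* U+005D sorts first of the three */
  case '[':  return 0x5C;  /* U+005B sorts second */
  case '\\': return 0x5D;  /* U+005C sorts last */
  default:   return c;
  }
}
```

Because {{]}} sorts before {{[}} here, comparing the raw bytes after case folding gives the wrong order, so a bulk step would have to remap these three byte values inside the loaded number as well.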

            bar Alexander Barkov made changes -
            Description h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counterparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Advance the pointers by 4 or 8 bytes and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The changes must not cause any serious slowdown for multi-byte data!


            h2. Collations that won't be changed in this task

            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code and have their own implementations.

            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because these collations additionally change the order of these ASCII characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later under a separate task.

            h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counterparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Advance the pointers by 4 or 8 bytes and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The expected performance improvement on the pure ASCII range for strings 4 or more bytes long is between 2 and 3 times (depending on the exact length and collation).

            Note, the changes must not cause any serious slowdown (more than 10%) for:
            - strings with multi-byte characters
            - short strings 1..3 bytes long


            h2. Collations that won't be changed in this task

            h3. 8bit case insensitive collations
            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code and have their own implementations.

            h3. Three Chinese case insensitive collations
            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because all these three collations additionally change the order of some ASCII punctuation characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later under a separate task.

            h3. Case insensitive collations for ucs2, utf16, utf32
            These character sets have separate implementations and don't use the mentioned code. They'll be improved under terms of a separate task.

            bar Alexander Barkov made changes -
            Description h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counterparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Advance the pointers by 4 or 8 bytes and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The expected performance improvement on the pure ASCII range for strings 4 or more bytes long is between 2 and 3 times (depending on the exact length and collation).

            Note, the changes must not cause any serious slowdown (more than 10%) for:
            - strings with multi-byte characters
            - short strings 1..3 bytes long


            h2. Collations that won't be changed in this task

            h3. 8bit case insensitive collations
            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code and have their own implementations.

            h3. Three Chinese case insensitive collations
            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because all these three collations additionally change the order of some ASCII punctuation characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later under a separate task.

            h3. Case insensitive collations for ucs2, utf16, utf32
            These character sets have separate implementations and don't use the mentioned code. They'll be improved under terms of a separate task.

            h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counterparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Advance the pointers by 4 or 8 bytes and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The expected performance improvement on the pure ASCII data (for strings with octet length >= 4) is between 2 and 3 times (depending on the exact length and collation).

            Note, the changes must not cause any serious slowdown (more than 10%) for:
            - strings with multi-byte characters
            - short strings 1..3 bytes long


            h2. Collations that won't be changed in this task

            h3. 8bit case insensitive collations
            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code and have their own implementations.

            h3. Three Chinese case insensitive collations
            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because all these three collations additionally change the order of some ASCII punctuation characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later under a separate task.

            h3. Case insensitive collations for ucs2, utf16, utf32
            These character sets have separate implementations and don't use the mentioned code. They'll be improved under terms of a separate task.

            bar Alexander Barkov made changes -
            Description h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counterparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Advance the pointers by 4 or 8 bytes and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The expected performance improvement on the pure ASCII data (for strings with octet length >= 4) is between 2 and 3 times (depending on the exact length and collation).

            Note, the changes must not cause any serious slowdown (more than 10%) for:
            - strings with multi-byte characters
            - short strings 1..3 bytes long


            h2. Collations that won't be changed in this task

            h3. 8bit case insensitive collations
            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code and have their own implementations.

            h3. Three Chinese case insensitive collations
            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because all these three collations additionally change the order of some ASCII punctuation characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later under a separate task.

            h3. Case insensitive collations for ucs2, utf16, utf32
            These character sets have separate implementations and don't use the mentioned code. They'll be improved under terms of a separate task.

            h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counterparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Advance the pointers by 4 or 8 bytes and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The performance of the low level comparison functions can be measured with the {{BENCHMARK()}} SQL function, e.g.:
            {code:sql}
            SET NAMES utf8mb3 COLLATE utf8mb3_general_ci;
            SELECT BENCHMARK(10000000,'aaaaaaaaaaaaaaaa'='aaaaaaaaaaaaaaaa');
            {code}

            The expected performance improvement on the pure ASCII data (for strings with octet length >= 4) is between 2 and 3 times (depending on the exact length and collation).

            Note, the changes must not cause any serious slowdown (more than 10%) for:
            - strings with multi-byte characters
            - short strings 1..3 bytes long


            h2. Collations that won't be changed in this task

            h3. 8bit case insensitive collations
            MariaDB has a number of 8bit case insensitive collations with trivial toupper mapping on the ASCII range. So they can get optimized in the same way. But we'll improve these collations under terms of a separate task because they don't use the mentioned code and have their own implementations.

            h3. Three Chinese case insensitive collations
            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because all these three collations additionally change the order of some ASCII punctuation characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later under a separate task.

            h3. Case insensitive collations for ucs2, utf16, utf32
            These character sets have separate implementations and don't use the mentioned code. They'll be improved under terms of a separate task.

            bar Alexander Barkov made changes -
            Description h2. Binary collations to be improved

            The following binary multi-byte collations (together with their _nopad_bin counterparts):
            - big5_bin
            - cp932_bin
            - eucjpms_bin
            - euckr_bin
            - gb2312_bin
            - gbk_bin
            - sjis_bin
            - ujis_bin
            - utf8mb3_bin
            - utf8mb4_bin

            can improve their performance if in this code in strcoll.ic:

            {code:cpp}
            static int
            MY_FUNCTION_NAME(strnncollsp)(CHARSET_INFO *cs __attribute__((unused)),
                                          const uchar *a, size_t a_length,
                                          const uchar *b, size_t b_length)
            {
              const uchar *a_end= a + a_length;
              const uchar *b_end= b + b_length;
              for ( ; ; )
              {
                int a_weight, b_weight, res;
                uint a_wlen= MY_FUNCTION_NAME(scan_weight)(&a_weight, a, a_end);
                ...
            {code}
            we catch pure ASCII and try to handle 4 or even 8 bytes in one iteration by loading string data into big-endian uint32 or uint64 numbers, then comparing these two numbers.


            h2. Case insensitive collations to be improved

            Additionally, the following case insensitive multibyte collations (and their _nopad_ci counterparts):
            - utf8mb3_general_ci
            - utf8mb3_general_mysql500_ci
            - utf8mb4_general_ci
            - cp932_japanese_ci
            - eucjpms_japanese_ci
            - euckr_korean_ci
            - sjis_japanese_ci
            - ujis_japanese_ci

            can use the same idea because for ASCII they perform only a trivial mapping from lower case Latin letters {{[a-z]}} to their upper case counterparts {{[A-Z]}}, and after this mapping is done the comparison is performed in binary style. These collations can do the following on every iteration step:
            - Test the leading 4 or 8 bytes in the two strings for pure ASCII data and go to the old code on failure (to handle multi-byte characters)
            - Load the two strings into two uint32 or uint64 numbers
            - Perform bulk conversion of all bytes in the two numbers from {{[61..7A]}} to {{[41..5A]}} (i.e. from {{[a-z]}} to {{[A-Z]}})
            - Compare the numbers and return if they are different
            - Advance the pointers by 4 or 8 bytes and continue the loop

            Note, the exact way of bulk conversion of numbers to upper case is to be found out by the developer.

            h2. Requirements

            The performance of the low level comparison functions can be measured with the {{BENCHMARK()}} SQL function, e.g.:
            {code:sql}
            SET NAMES utf8mb3 COLLATE utf8mb3_general_ci;
            SELECT BENCHMARK(10000000,'aaaaaaaaaaaaaaaa'='aaaaaaaaaaaaaaaa');
            {code}

            The expected performance improvement on the pure ASCII data (for strings with octet length >= 4) is between 2 and 3 times (depending on the exact length and collation).

            Note: the changes must not cause any serious slowdown (more than 10%) for:
            - strings with multi-byte characters
            - short strings 1..3 bytes long


            h2. Collations that won't be changed in this task

            h3. 8bit case insensitive collations
            MariaDB has a number of 8bit case insensitive collations with a trivial toupper mapping on the ASCII range, so they can be optimized in the same way. However, we'll improve these collations in a separate task because they don't use the mentioned code and have their own implementations.

            h3. Three Chinese case insensitive collations
            Also, under terms of this task we won't change the following multi-byte case insensitive collations (and their _nopad_ci counterparts):
            - big5_chinese_ci
            - gb2312_chinese_ci
            - gbk_chinese_ci

            because all these three collations additionally change the order of some ASCII punctuation characters:

            ||Weight||Character name||Character||
            |0x5B|U+005D RIGHT SQUARE BRACKET|]|
            |0x5C|U+005B LEFT SQUARE BRACKET|[|
            |0x5D|U+005C REVERSE SOLIDUS|\|

            So the bulk conversion step needs more effort for them, and the proposed optimization may not be efficient. These collations will be improved later in a separate task.

            h3. Case insensitive collations for ucs2, utf16, utf32
            These character sets have separate implementations and don't use the mentioned code. They'll be improved under terms of a separate task.

            bar Alexander Barkov made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            bar Alexander Barkov made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]
            bar Alexander Barkov made changes -
            Assignee Alexander Barkov [ bar ] Sergei Golubchik [ serg ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            serg Sergei Golubchik made changes -
            Assignee Sergei Golubchik [ serg ] Alexander Barkov [ bar ]
            Status In Review [ 10002 ] Stalled [ 10000 ]
            bar Alexander Barkov added a comment - - edited

            Testing

            Tests were done with the help of a standalone benchmarking program calling cs->cset->strnncollsp() in a loop.
            The benchmark program measures the time needed for one million calls.

            The test data is attached to this issue as the file all.txt.

            The test data set included strings of different lengths (1, 2, 3, 4, 8 and 16 characters) and different repertoires:

            Repertoire    Unicode Range    Comment
            ----------    --------------   -------
            ascii         U+0000..U+007F   7bit ASCII
            cyr           U+0400..U+04FF   Cyrillic
            cjk           U+4E00..U+9FFF   CJK Unified Ideographs
            lat2          U+0080..U+024F   Latin Supplement/Extended (use 2 bytes in utf8)
            lat3          U+1E00..U+1EFF   Latin Extended Additional (use 3 bytes in utf8)
            lat12                          Mixture of ascii and lat2
            lat23                          Mixture of lat2 and lat3
            

            The benchmark program was run twice: before the patch and after the patch.
            The performance improvement was calculated as "time before the patch" divided by "time after the patch".

            The numbers in the tables below mean the following:

            • If the number is greater than 1, it means a performance improvement (higher is better); 1 means no change
            • If the number is less than 1, it means a performance degradation (lower is worse)
            • If a cell contains NULL, the character set of the given collation cannot encode the given repertoire
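The measurement described above can be sketched roughly as follows. This is not the actual benchmark program: {{cmp_fn}}, {{memcmp_cmp}}, {{time_calls}} and {{speedup}} are hypothetical names, and {{memcmp_cmp}} merely stands in for a collation's comparison function.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature of the comparison function under test. */
typedef int (*cmp_fn)(const unsigned char *a, size_t a_len,
                      const unsigned char *b, size_t b_len);

/* Stand-in comparator: plain byte order, shorter string sorts first. */
static int memcmp_cmp(const unsigned char *a, size_t a_len,
                      const unsigned char *b, size_t b_len)
{
  size_t n = a_len < b_len ? a_len : b_len;
  int r = memcmp(a, b, n);
  if (r != 0)
    return r;
  return (a_len > b_len) - (a_len < b_len);
}

/* CPU seconds spent on `calls` invocations of fn on the same pair. */
static double time_calls(cmp_fn fn, const char *a, const char *b, long calls)
{
  size_t a_len = strlen(a), b_len = strlen(b);
  volatile int sink = 0;          /* keeps the calls from being optimized out */
  clock_t start = clock();
  for (long i = 0; i < calls; i++)
    sink += fn((const unsigned char *) a, a_len,
               (const unsigned char *) b, b_len);
  (void) sink;
  return (double) (clock() - start) / CLOCKS_PER_SEC;
}

/* The ratio reported in the tables: old time divided by new time. */
static double speedup(double time_before, double time_after)
{
  return time_before / time_after;
}
```

For example, a collation that needed 3.0 seconds before the patch and 1.5 seconds after it would be reported as 2.000.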

            CHAR_LENGTH>=4

            In tests where both strings have char_length>=4, the benchmark program demonstrated the following average time difference (old time divided by new time):

            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+
            | coll                      | avg_ascii | avg_cyr | avg_cjk | avg_lat2 | avg_lat12 | avg_lat3 | avg_lat13 | avg_lat23 |
            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+
            | big5_bin                  |     2.323 |   1.156 |   1.072 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | big5_nopad_bin            |     2.818 |   1.148 |   1.131 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_bin                 |     3.019 |   1.064 |   1.008 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_japanese_ci         |     2.065 |   1.061 |   1.150 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_japanese_nopad_ci   |     2.312 |   0.931 |   0.967 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_nopad_bin           |     3.034 |   1.106 |   0.962 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | eucjpms_bin               |     3.470 |   1.535 |   1.599 |    1.186 |     1.440 |     NULL |      NULL |      NULL |
            | eucjpms_japanese_ci       |     1.822 |   1.009 |   1.052 |    0.927 |     1.093 |     NULL |      NULL |      NULL |
            | eucjpms_japanese_nopad_ci |     1.595 |   0.930 |   0.883 |    0.957 |     0.957 |     NULL |      NULL |      NULL |
            | eucjpms_nopad_bin         |     2.339 |   1.143 |   1.199 |    0.884 |     1.071 |     NULL |      NULL |      NULL |
            | euckr_bin                 |     1.797 |   0.710 |   0.666 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_korean_ci           |     1.649 |   0.855 |   0.788 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_korean_nopad_ci     |     1.718 |   0.958 |   0.910 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_nopad_bin           |     2.535 |   0.951 |   0.964 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | gb2312_bin                |     2.891 |   0.924 |   0.803 |    0.766 |     1.207 |     NULL |      NULL |      NULL |
            | gb2312_nopad_bin          |     2.087 |   1.108 |   1.059 |    1.055 |     1.117 |     NULL |      NULL |      NULL |
            | gbk_bin                   |     3.236 |   0.956 |   0.968 |    0.982 |     1.195 |     NULL |      NULL |      NULL |
            | gbk_nopad_bin             |     2.215 |   0.912 |   0.856 |    0.817 |     1.049 |     NULL |      NULL |      NULL |
            | sjis_bin                  |     2.589 |   0.939 |   0.977 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_japanese_ci          |     2.036 |   1.043 |   1.076 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_japanese_nopad_ci    |     2.053 |   1.010 |   1.157 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_nopad_bin            |     2.850 |   1.151 |   1.223 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | ujis_bin                  |     3.271 |   1.350 |   1.463 |    1.184 |     1.410 |     NULL |      NULL |      NULL |
            | ujis_japanese_ci          |     2.097 |   0.961 |   1.040 |    1.052 |     1.155 |     NULL |      NULL |      NULL |
            | ujis_japanese_nopad_ci    |     1.577 |   0.896 |   0.879 |    0.892 |     0.943 |     NULL |      NULL |      NULL |
            | ujis_nopad_bin            |     2.263 |   1.104 |   1.155 |    0.893 |     1.106 |     NULL |      NULL |      NULL |
            | utf8mb3_bin               |     2.708 |   1.348 |   1.198 |    1.345 |     1.534 |    1.124 |     1.403 |     1.257 |
            | utf8mb3_general_ci        |     2.539 |   1.142 |   0.958 |    1.139 |     1.257 |    1.269 |     1.102 |     1.241 |
            | utf8mb4_bin               |     2.713 |   1.110 |   1.116 |    1.063 |     1.434 |    1.120 |     1.336 |     1.119 |
            | utf8mb4_general_ci        |     2.357 |   1.099 |   1.006 |    1.100 |     1.290 |    0.999 |     1.093 |     1.048 |
            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+

            CHAR_LENGTH>=16

            On long strings with CHAR_LENGTH>=16 the patch demonstrates the best performance improvement on the ascii, lat12 and lat13 repertoires:

            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+
            | coll                      | avg_ascii | avg_cyr | avg_cjk | avg_lat2 | avg_lat12 | avg_lat3 | avg_lat13 | avg_lat23 |
            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+
            | big5_bin                  |     2.323 |   1.156 |   1.072 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | big5_nopad_bin            |     2.818 |   1.148 |   1.131 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_bin                 |     3.019 |   1.064 |   1.008 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_japanese_ci         |     2.065 |   1.061 |   1.150 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_japanese_nopad_ci   |     2.312 |   0.931 |   0.967 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_nopad_bin           |     3.034 |   1.106 |   0.962 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | eucjpms_bin               |     3.470 |   1.535 |   1.599 |    1.186 |     1.440 |     NULL |      NULL |      NULL |
            | eucjpms_japanese_ci       |     1.822 |   1.009 |   1.052 |    0.927 |     1.093 |     NULL |      NULL |      NULL |
            | eucjpms_japanese_nopad_ci |     1.595 |   0.930 |   0.883 |    0.957 |     0.957 |     NULL |      NULL |      NULL |
            | eucjpms_nopad_bin         |     2.339 |   1.143 |   1.199 |    0.884 |     1.071 |     NULL |      NULL |      NULL |
            | euckr_bin                 |     1.797 |   0.710 |   0.666 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_korean_ci           |     1.649 |   0.855 |   0.788 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_korean_nopad_ci     |     1.718 |   0.958 |   0.910 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_nopad_bin           |     2.535 |   0.951 |   0.964 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | gb2312_bin                |     2.891 |   0.924 |   0.803 |    0.766 |     1.207 |     NULL |      NULL |      NULL |
            | gb2312_nopad_bin          |     2.087 |   1.108 |   1.059 |    1.055 |     1.117 |     NULL |      NULL |      NULL |
            | gbk_bin                   |     3.236 |   0.956 |   0.968 |    0.982 |     1.195 |     NULL |      NULL |      NULL |
            | gbk_nopad_bin             |     2.215 |   0.912 |   0.856 |    0.817 |     1.049 |     NULL |      NULL |      NULL |
            | sjis_bin                  |     2.589 |   0.939 |   0.977 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_japanese_ci          |     2.036 |   1.043 |   1.076 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_japanese_nopad_ci    |     2.053 |   1.010 |   1.157 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_nopad_bin            |     2.850 |   1.151 |   1.223 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | ujis_bin                  |     3.271 |   1.350 |   1.463 |    1.184 |     1.410 |     NULL |      NULL |      NULL |
            | ujis_japanese_ci          |     2.097 |   0.961 |   1.040 |    1.052 |     1.155 |     NULL |      NULL |      NULL |
            | ujis_japanese_nopad_ci    |     1.577 |   0.896 |   0.879 |    0.892 |     0.943 |     NULL |      NULL |      NULL |
            | ujis_nopad_bin            |     2.263 |   1.104 |   1.155 |    0.893 |     1.106 |     NULL |      NULL |      NULL |
            | utf8mb3_bin               |     2.708 |   1.348 |   1.198 |    1.345 |     1.534 |    1.124 |     1.403 |     1.257 |
            | utf8mb3_general_ci        |     2.539 |   1.142 |   0.958 |    1.139 |     1.257 |    1.269 |     1.102 |     1.241 |
            | utf8mb4_bin               |     2.713 |   1.110 |   1.116 |    1.063 |     1.434 |    1.120 |     1.336 |     1.119 |
            | utf8mb4_general_ci        |     2.357 |   1.099 |   1.006 |    1.100 |     1.290 |    0.999 |     1.093 |     1.048 |
            | memcmp                    |     0.994 |   0.999 |   1.009 |    0.997 |     1.001 |    1.020 |     1.004 |     0.996 |
            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+
            

            CHAR_LENGTH<4

            No optimization was done for short strings. The expected degradation should not be more than 5%.

            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+
            | coll                      | avg_ascii | avg_cyr | avg_cjk | avg_lat2 | avg_lat12 | avg_lat3 | avg_lat13 | avg_lat23 |
            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+
            | big5_bin                  |     1.001 |   1.169 |   1.127 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | big5_nopad_bin            |     1.014 |   1.151 |   1.123 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_bin                 |     0.865 |   1.101 |   1.078 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_japanese_ci         |     1.032 |   1.031 |   1.053 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_japanese_nopad_ci   |     0.985 |   0.893 |   0.919 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | cp932_nopad_bin           |     1.032 |   1.152 |   1.086 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | eucjpms_bin               |     1.091 |   1.431 |   1.438 |    1.188 |     1.050 |     NULL |      NULL |      NULL |
            | eucjpms_japanese_ci       |     1.104 |   1.018 |   1.020 |    0.932 |     0.966 |     NULL |      NULL |      NULL |
            | eucjpms_japanese_nopad_ci |     0.843 |   0.914 |   0.909 |    0.959 |     0.877 |     NULL |      NULL |      NULL |
            | eucjpms_nopad_bin         |     1.056 |   1.161 |   1.187 |    0.887 |     0.934 |     NULL |      NULL |      NULL |
            | euckr_bin                 |     0.974 |   0.727 |   0.734 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_korean_ci           |     0.818 |   0.855 |   0.840 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_korean_nopad_ci     |     0.698 |   0.938 |   0.934 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | euckr_nopad_bin           |     1.014 |   0.906 |   0.922 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | gb2312_bin                |     1.043 |   0.964 |   0.927 |    0.888 |     1.068 |     NULL |      NULL |      NULL |
            | gb2312_nopad_bin          |     1.018 |   1.126 |   1.088 |    1.079 |     1.192 |     NULL |      NULL |      NULL |
            | gbk_bin                   |     1.088 |   0.959 |   0.953 |    0.976 |     1.046 |     NULL |      NULL |      NULL |
            | gbk_nopad_bin             |     1.041 |   0.960 |   0.954 |    0.929 |     1.036 |     NULL |      NULL |      NULL |
            | sjis_bin                  |     0.770 |   0.947 |   0.946 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_japanese_ci          |     0.845 |   1.014 |   1.022 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_japanese_nopad_ci    |     0.835 |   0.940 |   0.986 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | sjis_nopad_bin            |     1.070 |   1.089 |   1.054 |     NULL |      NULL |     NULL |      NULL |      NULL |
            | ujis_bin                  |     1.097 |   1.275 |   1.299 |    1.212 |     1.073 |     NULL |      NULL |      NULL |
            | ujis_japanese_ci          |     1.116 |   0.977 |   0.984 |    1.045 |     0.995 |     NULL |      NULL |      NULL |
            | ujis_japanese_nopad_ci    |     0.813 |   0.887 |   0.871 |    0.869 |     0.839 |     NULL |      NULL |      NULL |
            | ujis_nopad_bin            |     1.010 |   1.088 |   1.123 |    0.904 |     0.971 |     NULL |      NULL |      NULL |
            | utf8mb3_bin               |     1.270 |   1.306 |   1.170 |    1.314 |     1.218 |    1.129 |     1.218 |     1.252 |
            | utf8mb3_general_ci        |     0.928 |   1.040 |   0.949 |    1.038 |     0.952 |    1.104 |     0.956 |     1.096 |
            | utf8mb4_bin               |     1.287 |   1.052 |   1.076 |    1.044 |     1.110 |    1.112 |     1.225 |     1.127 |
            | utf8mb4_general_ci        |     0.906 |   1.123 |   1.028 |    1.126 |     1.029 |    1.041 |     0.999 |     1.091 |
            +---------------------------+-----------+---------+---------+----------+-----------+----------+-----------+-----------+
            

            Microbenchmark comments

            In all test results we can observe some noise on top of the actual performance changes directly caused by the changes in the code.

            The noise arises because, after a change in one function, the linker can reorder all functions in the object file (and in the final binary), and this can visibly affect the performance of every function handling an individual collation (by up to 20%). At run time, the closer a function resides in memory to the benchmark loop, the faster it runs; this is a CPU cache effect.

            The noise can be different in the server (instead of the standalone program).
            The noise can change between server versions.

            So, in addition to the individual numbers per collation, the average performance improvement across all collations is also important:

            • For strings with CHAR_LENGTH>=4:

              +-----------+---------+---------+----------+-----------+----------+-----------+-----------+
              | avg_ascii | avg_cyr | avg_cjk | avg_lat2 | avg_lat12 | avg_lat3 | avg_lat13 | avg_lat23 |
              +-----------+---------+---------+----------+-----------+----------+-----------+-----------+
              |     2.138 |   1.035 |   1.028 |    1.023 |     1.164 |    1.096 |     1.210 |     1.129 |
              +-----------+---------+---------+----------+-----------+----------+-----------+-----------+
              

            • For strings with CHAR_LENGTH<4:

              +-----------+---------+---------+----------+-----------+----------+-----------+-----------+
              | avg_ascii | avg_cyr | avg_cjk | avg_lat2 | avg_lat12 | avg_lat3 | avg_lat13 | avg_lat23 |
              +-----------+---------+---------+----------+-----------+----------+-----------+-----------+
              |     0.997 |   1.024 |   1.016 |    1.025 |     1.017 |    1.078 |     1.103 |     1.117 |
              +-----------+---------+---------+----------+-----------+----------+-----------+-----------+
              

            bar Alexander Barkov made changes -
            Attachment all.txt [ 58937 ]
            bar Alexander Barkov made changes -
            Assignee Alexander Barkov [ bar ] Sergei Golubchik [ serg ]
            Status Stalled [ 10000 ] In Review [ 10002 ]

            serg, please have a look at the patch with your review suggestions addressed:
            https://github.com/MariaDB/server/commit/0629711db43ec489a360d8f689b72fac66a2470b

            serg Sergei Golubchik made changes -
            Fix Version/s 10.7.0 [ 26072 ]
            Fix Version/s 10.7 [ 24805 ]
            Resolution Fixed [ 1 ]
            Status In Review [ 10002 ] Closed [ 6 ]

            People

              serg Sergei Golubchik
              bar Alexander Barkov