Details

Type: Task
Status: Closed
Priority: Critical
Resolution: Fixed
Fix Version/s: 10.10.1
Component/s: Character Sets
Labels:
- Preview_10.10

Description

Recently, in MDEV-26572, we significantly improved the performance of some simple multi-byte collations on the ASCII range. The idea was to handle multiple ASCII characters (4 or 8) at the same time.

A similar improvement can be made for the UCA collations for utf8mb3 and utf8mb4.

It's hard to handle 4 or 8 bytes at the same time, because UCA is much more complex than the simple collations improved in MDEV-26572. However, it's possible to handle at least 2 bytes at the same time. This will improve performance for:

- 2-byte sequences representing two consecutive ASCII characters
- 2-byte sequences representing a single 2-byte character (such as accented Latin letters, Greek, Cyrillic, Armenian, Hebrew, Arabic).

Performance improvement, level 1:

For every byte pair [00..FF][00..FF] which:

a. consists of two ASCII characters or makes a well-formed two-byte character
b. has a total weight string that fits into 4 weights (the concatenated weight string in the case of two ASCII characters, or a single weight string in the case of a two-byte character)
c. has a context-independent weight (i.e. one that does not depend on contractions or previous context pairs)

let's store weights in a new separate array of 64K elements of a new data type MY_UCA_2BYTES_ITEM, defined as follows:

#define MY_UCA_2BYTES_MAX_WEIGHT_SIZE (4+1) /* Including 0 terminator */

typedef struct my_uca_2bytes_item_t
{
  uint16 weight[MY_UCA_2BYTES_MAX_WEIGHT_SIZE];
} MY_UCA_2BYTES_ITEM;

so during scanner_next() we can scan two bytes at a time. Byte pairs that do not match the conditions a-c should be marked in this array as not applicable for optimization, so they can be scanned as before.
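A minimal sketch of how condition (a) and the lookup in the 64K-element array might look. The helper names (pair_index, pair_is_two_ascii, pair_is_wellformed_2byte, pair_lookup) are hypothetical, uint16_t stands in for the server's uint16, and the convention that weight[0] == 0 marks a pair as "not applicable" is an assumption made for illustration; the actual patch may encode this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MY_UCA_2BYTES_MAX_WEIGHT_SIZE (4+1) /* Including 0 terminator */

typedef struct my_uca_2bytes_item_t
{
  uint16_t weight[MY_UCA_2BYTES_MAX_WEIGHT_SIZE];
} MY_UCA_2BYTES_ITEM;

/* Index of the byte pair [b0][b1] in a table of 0x10000 elements. */
static inline size_t pair_index(uint8_t b0, uint8_t b1)
{
  return ((size_t) b0 << 8) | b1;
}

/* Condition (a), first half: both bytes are ASCII characters. */
static inline int pair_is_two_ascii(uint8_t b0, uint8_t b1)
{
  return b0 < 0x80 && b1 < 0x80;
}

/*
  Condition (a), second half: a well-formed two-byte UTF-8 character,
  i.e. a lead byte C2..DF followed by a continuation byte 80..BF.
*/
static inline int pair_is_wellformed_2byte(uint8_t b0, uint8_t b1)
{
  return b0 >= 0xC2 && b0 <= 0xDF && b1 >= 0x80 && b1 <= 0xBF;
}

/*
  Return the 0-terminated weight string for a byte pair, or NULL when
  the pair is not applicable for the optimization and must be scanned
  the old way.
  Assumption (hypothetical): weight[0] == 0 means "not applicable".
*/
static inline const uint16_t *
pair_lookup(const MY_UCA_2BYTES_ITEM *table, uint8_t b0, uint8_t b1)
{
  const MY_UCA_2BYTES_ITEM *item;
  assert(table != NULL);
  item= &table[pair_index(b0, b1)];
  return item->weight[0] ? item->weight : NULL;
}
```

With this layout a scanner would call pair_lookup() on the next two bytes and fall back to the existing one-character path whenever it returns NULL, for example on the lead byte of a 3-byte or 4-byte character.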

Performance improvement, level 2:

For every byte pair which is applicable for the level 1 optimization, and which produces only one or two weights, let's store weights in one more array of 64K elements of a new data type MY_UCA_WEIGHT2, defined as follows:

typedef struct my_uca_weight2_t
{
  uint16 weight[2];
} MY_UCA_WEIGHT2;

So at the beginning of strnncoll*() we can skip equal prefixes using an even more efficient loop. This loop consumes two bytes at a time. It keeps scanning while the two bytes on both sides produce weight strings of equal length (i.e. one weight on both sides, or two weights on both sides).

This makes it possible to efficiently compare:

- Context-independent sequences consisting of two ASCII characters
- Context-independent 2-byte characters
- Contractions consisting of two ASCII characters, e.g. Czech "ch"
- Some tricky cases: "ss" vs "SHARP S" ("ss" produces two weights, 0xC39F also produces two weights)
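The prefix-skipping loop described above could be sketched roughly as follows. The function names (weight2_count, skip_equal_prefix), the use of uint16_t instead of the server's uint16, and the encoding of "not applicable" as weight[0] == 0 and "single weight" as weight[1] == 0 are all assumptions for illustration, not the code from the patch. The sketch only reports how many bytes can be skipped; on the first difference it simply stops and leaves the decision to the generic comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct my_uca_weight2_t
{
  uint16_t weight[2];
} MY_UCA_WEIGHT2;

/*
  Number of weights stored in an item: 1 or 2, or 0 when the byte pair
  is not applicable (assumed encoding: weight[0] == 0 means "not
  applicable", weight[1] == 0 means "only one weight").
*/
static int weight2_count(const MY_UCA_WEIGHT2 *w)
{
  if (!w->weight[0])
    return 0;
  return w->weight[1] ? 2 : 1;
}

/*
  Return the number of leading bytes that can be skipped on both sides
  because every consumed byte pair produced the same number of weights
  with the same values. Consumes two bytes at a time.
*/
static size_t skip_equal_prefix(const MY_UCA_WEIGHT2 *table,
                                const uint8_t *a, size_t alen,
                                const uint8_t *b, size_t blen)
{
  size_t pos= 0;
  assert(table != NULL);
  while (pos + 2 <= alen && pos + 2 <= blen)
  {
    const MY_UCA_WEIGHT2 *wa= &table[((size_t) a[pos] << 8) | a[pos + 1]];
    const MY_UCA_WEIGHT2 *wb= &table[((size_t) b[pos] << 8) | b[pos + 1]];
    int na= weight2_count(wa);
    if (na == 0 || na != weight2_count(wb))
      break; /* Not optimizable, or weight strings of different length */
    if (wa->weight[0] != wb->weight[0] || wa->weight[1] != wb->weight[1])
      break; /* First difference: leave it to the generic comparison */
    pos+= 2;
  }
  return pos;
}
```

With a table in which both "ss" (bytes 73 73) and "SHARP S" (bytes C3 9F) map to the same two weights, comparing 'ssss' with 'ßß' would be consumed entirely by this loop, two bytes per iteration on each side.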

Other Unicode character sets

Within the scope of this patch we'll improve only utf8mb3 and utf8mb4. Other Unicode character sets (ucs2, utf16le, utf16, utf32) can also reuse the same optimization, but this will need some additional code tweaks. Let's do it later under a separate task.

Issue Links

blocks

- MDEV-25829 Change default Unicode collation to uca1400_ai_ci (Closed)

relates to

- MDEV-26572 Improve simple multibyte collation performance on the ASCII range (Closed)
- MDEV-32340 Improve performance of my_uca_level_booster_simple_prefix_cmp (Open)
- MDEV-27009 Add UCA-14.0.0 collations (Closed)
- MDEV-27265 Improve contraction performance in UCA collations (Closed)

Activity

Alexander Barkov added a comment - 2021-12-15 15:54 - edited

Benchmarking

JIRA does not allow characters outside the BMP.
So in the text below, this character:

U+1F44D THUMBS UP SIGN (_utf8 F09F918D)

has been replaced with Z.

utf8mb4_general_ci (not changed - just for reference)

-- Warming up

SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;

DO BENCHMARK(10000000,strcmp('xxxx','xxxx'));

-- Benchmarking

SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;

DO BENCHMARK(10000000,strcmp('aaaa','aaaa'));

DO BENCHMARK(10000000,strcmp('aaaaaaaa','aaaaaaaa'));

DO BENCHMARK(10000000,strcmp('яяяя','яяяя'));

DO BENCHMARK(10000000,strcmp('ắắắắ','ắắắắ'));

DO BENCHMARK(10000000,strcmp('ZZZZ','ZZZZ'));

MariaDB [test]> DO BENCHMARK(10000000,strcmp('aaaa','aaaa'));

Query OK, 0 rows affected (0.107 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('aaaaaaaa','aaaaaaaa'));

Query OK, 0 rows affected (0.109 sec)

MariaDB [test]> SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;

Query OK, 0 rows affected (0.000 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('aaaa','aaaa'));

Query OK, 0 rows affected (0.103 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('яяяя','яяяя'));

Query OK, 0 rows affected (0.192 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('ắắắắ','ắắắắ'));

Query OK, 0 rows affected (0.266 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('ZZZZ','ZZZZ'));

Query OK, 0 rows affected (0.260 sec)

Summary

A         B          Time   Comment

----      -----      ----   --------

aaaa      aaaa       0.103  ASCII

aaaaaaaa  aaaaaaaa   0.109  ASCII

яяяя      яяяя       0.192  2-byte Cyrillic

ắắắắ      ắắắắ       0.266  3-byte Vietnamese

ZZZZ      ZZZZ       0.260  4-byte Emoji (see comment above)

Old utf8mb4_unicode_ci (before the patch)

-- Warming up

SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;

DO BENCHMARK(10000000,strcmp('xxxx','xxxx'));

-- Benchmarking

SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;

DO BENCHMARK(10000000,strcmp('aaaa','aaaa'));

DO BENCHMARK(10000000,strcmp('aaaaaaaa','aaaaaaaa'));

DO BENCHMARK(10000000,strcmp('яяяя','яяяя'));

DO BENCHMARK(10000000,strcmp('ssss','ßß'));

DO BENCHMARK(10000000,strcmp('ắắắắ','ắắắắ'));

DO BENCHMARK(10000000,strcmp('ZZZZ','ZZZZ'));

MariaDB [test]> DO BENCHMARK(10000000,strcmp('aaaa','aaaa'));

Query OK, 0 rows affected (0.259 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('aaaaaaaa','aaaaaaaa'));

Query OK, 0 rows affected (0.405 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('яяяя','яяяя'));

Query OK, 0 rows affected (0.384 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('ssss','ßß'));

Query OK, 0 rows affected (0.265 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('ắắắắ','ắắắắ'));

Query OK, 0 rows affected (0.413 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('ZZZZ','ZZZZ'));

Query OK, 0 rows affected (0.417 sec)

Summary

A         B          Time    % of general_ci      Comment

----      -----      ----    -------------------  -------

aaaa      aaaa       0.259   251   (259/103*100)  ASCII

aaaaaaaa  aaaaaaaa   0.405   371   (405/109*100)  ASCII

яяяя      яяяя       0.384   200   (384/192*100)  2-byte Cyrillic

ssss      ßß         0.265   N/A                  ASCII vs 2-byte Latin with expansion

ắắắắ      ắắắắ       0.414   155   (414/266*100)  3-byte Vietnamese

ZZZZ      ZZZZ       0.417   160   (417/260*100)  4-byte Emoji (see comment above)

New utf8mb4_unicode_ci (after the patch)

-- Warming up

SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;

DO BENCHMARK(10000000,strcmp('xxxx','xxxx'));

-- Benchmarking

SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;

DO BENCHMARK(10000000,strcmp('aaaa','aaaa'));

DO BENCHMARK(10000000,strcmp('aaaaaaaa','aaaaaaaa'));

DO BENCHMARK(10000000,strcmp('яяяя','яяяя'));

DO BENCHMARK(10000000,strcmp('ssss','ßß'));

DO BENCHMARK(10000000,strcmp('ắắắắ','ắắắắ'));

DO BENCHMARK(10000000,strcmp('ZZZZ','ZZZZ'));

MariaDB [test]> DO BENCHMARK(10000000,strcmp('aaaa','aaaa'));

Query OK, 0 rows affected (0.160 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('aaaaaaaa','aaaaaaaa'));

Query OK, 0 rows affected (0.181 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('яяяя','яяяя'));

Query OK, 0 rows affected (0.177 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('ssss','ßß'));

Query OK, 0 rows affected (0.156 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('ắắắắ','ắắắắ'));

Query OK, 0 rows affected (0.433 sec)

MariaDB [test]> DO BENCHMARK(10000000,strcmp('ZZZZ','ZZZZ'));

Query OK, 0 rows affected (0.476 sec)

Summary

A         B         Time   % of utf8mb4_general_ci  Comment

----      ----      ----   -----------------------  -------

aaaa      aaaa      0.160  155 (160/103*100)        ASCII

aaaaaaaa  aaaaaaaa  0.181  166 (181/109*100)        ASCII

яяяя      яяяя      0.177  92  (177/192*100)        2-byte Cyrillic

ssss      ßß        0.156  N/A                      ASCII vs 2-byte Latin with expansion

ắắắắ      ắắắắ      0.433  163 (433/266*100)        3-byte Vietnamese

ZZZZ      ZZZZ      0.476  182 (476/260*100)        4-byte Emoji (see comment above)

Full summary

utf8mb4_general_ci - old utf8mb4_unicode_ci - new utf8mb4_unicode_ci

A         B        % New/Old   OldTime  % Old/general_ci   NewTime % New/general_ci    Comment

----      -----    ---------   -------  -----------------  ------- -----------------  -------

aaaa      aaaa            62   0.259    251 (259/103*100)  0.160   155 (160/103*100)  ASCII

aaaaaaaa  aaaaaaaa        45   0.405    371 (405/109*100)  0.181   166 (181/109*100)  ASCII

яяяя      яяяя            46   0.384    200 (384/192*100)  0.177   92  (177/192*100)  2-byte Cyrillic

ssss      ßß              59   0.265    N/A                0.156   N/A                ASCII vs 2-byte Latin with expansion

ắắắắ      ắắắắ           105   0.414    155 (414/266*100)  0.433   162 (433/266*100)  3-byte Vietnamese

ZZZZ      ZZZZ           114   0.417    160 (417/260*100)  0.476   182 (476/260*100)  Emoji, see comment above
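As a worked check of the arithmetic, the "% New/Old" column can be recomputed from the OldTime and NewTime columns. The helper below is a hypothetical sketch (percent_of is not a server function); it takes times in milliseconds and rounds to the nearest integer, which is an assumption that happens to reproduce all six values in that column.

```c
#include <assert.h>

/*
  Ratio num_ms/den_ms expressed as a percentage, rounded to the
  nearest integer. Times are passed in milliseconds to stay in
  integer arithmetic (e.g. 0.160 sec -> 160).
*/
static unsigned percent_of(unsigned num_ms, unsigned den_ms)
{
  assert(den_ms > 0);
  return (100u * num_ms + den_ms / 2) / den_ms;
}
```

For example, percent_of(160, 259) gives 62 for the 4-byte ASCII row and percent_of(476, 417) gives 114 for the emoji row.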

Observations:

Performance significantly improved:

- On 4-byte ASCII strings the new implementation takes 62% of the old time (155% of utf8mb4_general_ci)
- On 8-byte ASCII strings the new implementation takes 45% of the old time (166% of utf8mb4_general_ci)
- On 4-character (8-byte) Cyrillic strings the new implementation takes 46% of the old time (92% of utf8mb4_general_ci)
- On 'ssss' vs 'ßß' the new implementation takes 59% of the old time

Performance slightly degraded:

- On 4-character (4*3=12 byte) Vietnamese strings the new implementation takes 105% of the old time (and 162% of utf8mb4_general_ci)
- On 4-character (4*4=16 byte) Emoji strings the new implementation takes 114% of the old time (and 182% of utf8mb4_general_ci)

The slight slow-down on 3-byte and 4-byte characters is expected: the code now tries the optimized path first, fails, and then falls back to the old non-optimized path.

Sergei Golubchik added a comment - 2022-06-18 19:01

It's in this branch: preview-10.10-uca14.

Lena Startseva added a comment - 2022-07-13 07:24

Environment:

Linux  5.13.0-52-generic #59-Ubuntu SMP  x86_64 x86_64 x86_64 GNU/Linux

memory         64GiB System Memory

processor      11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz

Summary

A	B	general_ci	utf8mb4_unicode_ci (old)	utf8mb4_unicode_ci (new)	% New/Old	% Old/general_ci	% New/general_ci
aaaa	aaaa	0.363	0.933	0.545	54.4	257.0	150.1
aaaaaaaa	aaaaaaaa	0.396	1.452	0.651	44.8	366.7	164.4
яяяя	яяяя	0.702	1.278	0.652	51.0	182.1	92.9
ssss	ßß	0.487	0.949	0.545	57.4	194.9	111.9
ắắắắ	ắắắắ	0.885	1.361	1.685	123.8	153.8	190.4
		0.665	1.132	1.529	135.0	170.2	229.9

On my laptop the result is a little less optimistic, but in line with expectations.

Lena Startseva added a comment - 2022-08-08 15:39

Ok to push

Rick James added a comment - 2023-05-17 23:25

As for "other" charsets (ucs2, etc), I suggest that can be very low on the priority list. I have not heard of anyone creating a table with such. Importing is an unrelated topic; I would encourage converting to utf8mb4 during importation, thereby avoiding the need for collation speedups.

People

Assignee: Alexander Barkov

Reporter: Alexander Barkov

Votes: 1

Watchers: 7

Dates

Created: 2021-12-15 10:21

Updated: 2023-10-03 07:25

Resolved: 2022-08-10 19:47

MariaDB Server

Improve UCA collation performance for utf8mb3 and utf8mb4
