[MDEV-17511] Improve performance for ORDER BY with a CHAR(N) CHARACTER SET utf8_unicode_ci Created: 2018-10-21 Updated: 2018-10-21 Resolved: 2018-10-21 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Character Sets |
| Fix Version/s: | 10.4.0 |
| Type: | Task | Priority: | Major |
| Reporter: | Alexander Barkov | Assignee: | Alexander Barkov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Description |
|
Note that this problem is repeatable with all UCA collations that have the PAD SPACE attribute. This MDEV uses utf8_unicode_ci as the example for explanation. There is a bottleneck in these functions:
Generating weights for trailing spaces (which are almost always present in CHAR values) seems to be CPU hungry. It should be faster to trim trailing spaces in my_uca_strnxfrm_no_contractions_utf8mb3() before calling my_uca_strnxfrm_onelevel_internal_no_contractions_utf8mb3(). If this change makes the function return a key that is too short, the caller will append weights for implicit spaces anyway, up to the desired key size. This effectively generates exactly the same sortable key. Appending weights for implicit spaces is much less CPU hungry than a loop of scanner_next calls. |
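The idea in the description can be illustrated with a small, self-contained sketch. This is not MariaDB code: the toy "collation" below assigns one 16-bit weight per byte, and every name in it (toy_weight, put_weight, pad_with_spaces, toy_key_scan_all, toy_key_trim_first) as well as the TOY_SPACE_WEIGHT value are hypothetical placeholders chosen for illustration. It only shows why trimming trailing spaces first and letting the padding step supply the space weights can yield the same key as scanning every character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, not MariaDB code: a toy one-weight-per-byte
   "collation" where the space weight is an arbitrary placeholder. */
#define TOY_SPACE_WEIGHT 0x0209u

static uint16_t toy_weight(unsigned char ch)
{
  return ch == ' ' ? TOY_SPACE_WEIGHT : (uint16_t) (0x1000u + ch);
}

/* Store one 16-bit weight big-endian and return the advanced pointer. */
static unsigned char *put_weight(unsigned char *dst, uint16_t w)
{
  dst[0]= (unsigned char) (w >> 8);
  dst[1]= (unsigned char) (w & 0xFF);
  return dst + 2;
}

/* Fill the rest of the key with space weights (the cheap padding step). */
static void pad_with_spaces(unsigned char *dst, unsigned char *end)
{
  while (dst + 2 <= end)
    dst= put_weight(dst, TOY_SPACE_WEIGHT);
}

/* Baseline: every character, trailing spaces included, goes through the
   per-character weight lookup. */
void toy_key_scan_all(const char *src, size_t len,
                      unsigned char *key, size_t key_size)
{
  unsigned char *dst= key, *end= key + key_size;
  assert(key_size % 2 == 0);
  for (size_t i= 0; i < len && dst + 2 <= end; i++)
    dst= put_weight(dst, toy_weight((unsigned char) src[i]));
  pad_with_spaces(dst, end);
}

/* Proposed shape: trim trailing spaces first, scan only what is left,
   and let the padding step supply the space weights. */
void toy_key_trim_first(const char *src, size_t len,
                        unsigned char *key, size_t key_size)
{
  unsigned char *dst= key, *end= key + key_size;
  assert(key_size % 2 == 0);
  while (len > 0 && src[len - 1] == ' ')
    len--;
  for (size_t i= 0; i < len && dst + 2 <= end; i++)
    dst= put_weight(dst, toy_weight((unsigned char) src[i]));
  pad_with_spaces(dst, end);
}
```

In this toy model both functions fill the key buffer with identical bytes for any input, because a trailing space contributes the same weight whether it comes from the per-character loop or from the padding loop. The real UCA scanner does considerably more work per character than toy_weight, which is where the saving described above would come from.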
| Comments |
| Comment by Alexander Barkov [ 2018-10-21 ] |
|
|