[MDEV-27154] allkeys.txt based tests for Unicode-4.0.0 and 5.2.0 Created: 2021-12-02 Updated: 2021-12-20 Resolved: 2021-12-02 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Character Sets, Tests |
| Fix Version/s: | 10.8.0 |
| Type: | Task | Priority: | Major |
| Reporter: | Alexander Barkov | Assignee: | Alexander Barkov |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Description |
|
Let's add MTR tests which load the default weight table allkeys.txt from Unicode-4.0.0 and Unicode-5.2.0 to check that the collations utf8mb4_unicode_ci and utf8mb4_unicode_520_ci work as expected. These new tests will cover all characters in the range U+0000..U+10FFFF and will make sure that nothing breaks after the upcoming changes.

The idea is to calculate weights for every Unicode character in two ways:
1. Using WEIGHT_STRING() - this is the weight that the MariaDB collation returns for the character.
2. Using allkeys.txt - this is the weight that the default weight table lists for the character.

Both calculated values must be equal for every character. The only exception is the character "FDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM". It has 18 weights in allkeys.txt, while MariaDB has a limit of 8 weights per character. |
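The comparison described above can be sketched outside the server as follows. This is a minimal illustration and not the MTR test itself. It assumes that only the primary weight of each collation element is compared, that zero primary weights are ignored, and that the server-side value is obtained separately as an uppercase hex string (for example from HEX(WEIGHT_STRING(...))). The helper names parse_allkeys_line, expected_weight_hex and find_mismatches are hypothetical, and the sample lines only illustrate the allkeys.txt line format.

```python
import re

MAX_WEIGHTS = 8  # per-character limit mentioned in the description

# One collation element, e.g. [.0E33.0020.0008.0041] or [*0209.0020.0002.0020].
# Only the first (primary) weight is captured.
ELEMENT_RE = re.compile(r"\[[.*]([0-9A-Fa-f]{4})(?:\.[0-9A-Fa-f]{4,6})*\]")


def parse_allkeys_line(line):
    """Return (code_points, primary_weights), or None for comments and @ directives."""
    body = line.split("#", 1)[0].strip()
    if not body or body.startswith("@"):
        return None
    chars, _, elements = body.partition(";")
    code_points = tuple(int(cp, 16) for cp in chars.split())
    primaries = [int(w, 16) for w in ELEMENT_RE.findall(elements)]
    return code_points, primaries


def expected_weight_hex(primaries):
    """Format non-zero primary weights as concatenated 4-digit hex (assumed format)."""
    return "".join(f"{w:04X}" for w in primaries if w != 0)


def find_mismatches(allkeys_lines, actual_hex_by_char):
    """Compare table weights with server weights for single-character entries.

    Entries with more than MAX_WEIGHTS collation elements are reported
    separately instead of being compared.
    """
    mismatches, skipped = [], []
    for line in allkeys_lines:
        parsed = parse_allkeys_line(line)
        if parsed is None or len(parsed[0]) != 1:
            continue  # skip comments, directives and multi-character entries
        (cp,), primaries = parsed
        if len(primaries) > MAX_WEIGHTS:
            skipped.append(cp)
            continue
        expected = expected_weight_hex(primaries)
        actual = actual_hex_by_char.get(cp)
        if actual != expected:
            mismatches.append((cp, expected, actual))
    return mismatches, skipped


# Illustrative input in allkeys.txt format; `actual` stands in for server output.
sample = [
    "@version 4.0.0",
    "0041 ; [.0E33.0020.0008.0041] # LATIN CAPITAL LETTER A",
    "0020 ; [*0209.0020.0002.0020] # SPACE",
]
actual = {0x41: "0E33", 0x20: "0209"}
mismatches, skipped = find_mismatches(sample, actual)
```

With this shape, an over-long entry such as U+FDFA would land in the skipped list rather than being reported as a mismatch, which mirrors the single exception named in the description.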
| Comments |
| Comment by Alexander Barkov [ 2021-12-20 ] |
|
It's also repeatable with mtr --valgrind run with this smaller script:
|