[MDEV-26572] Improve simple multibyte collation performance on the ASCII range Created: 2021-09-08 Updated: 2023-02-23 Resolved: 2021-09-13 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Character Sets |
| Fix Version/s: | 10.7.0 |
| Type: | Task | Priority: | Major |
| Reporter: | Alexander Barkov | Assignee: | Sergei Golubchik |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Attachments: |
|
| Issue Links: |
|
| Description |
| Comments |
| Comment by Alexander Barkov [ 2021-09-13 ] |
Testing
Tests were done with the help of a standalone benchmarking program that calls cs->cset->strnncollsp() in a loop. The test data is attached to this issue as the file all.txt. The test data set included strings of different lengths (1, 2, 3, 4, 8 and 16 characters) and different repertoires:
The benchmark program was run twice: before the patch and after the patch. The numbers in the tables below mean the following:
CHAR_LENGTH>=4
In tests where both strings have char_length>=4, the benchmark program demonstrated the following average time difference (old time divided by new time):
CHAR_LENGTH>=16
On long strings with CHAR_LENGTH>=16, the patch demonstrates the best performance improvement on the ascii, lat12 and lat13 repertoires:
CHAR_LENGTH<4
Short strings were not optimized. The expected degradation should not be more than 5%.
Microbenchmark comments
In all test results we can observe some noise on top of the actual performance changes directly caused by the code changes. The noise arises because, after a change in one function, the linker can change the order of all functions in the object file (and in the final binary), and this can visibly affect the performance of every function handling an individual collation (by up to 20%). At run time, the closer a function resides in RAM to the benchmark loop, the faster it works; this relates to CPU caches. The noise can be different in the server than in the standalone program. So in addition to the individual numbers per collation, the average performance improvement across all collations is also important.
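The measurement approach described above (calling a comparison routine in a tight loop, then dividing the old time by the new time) can be sketched roughly as follows. This is not the benchmark program used for this issue: `compare_fn`, `demo_compare`, `time_compare` and `speedup` are hypothetical names, and `demo_compare` is only a simple bytewise stand-in with trailing-space handling, not the server's strnncollsp() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature for the routine under test (an assumption,
   not the actual MariaDB collation handler interface). */
typedef int (*compare_fn)(const unsigned char *a, size_t alen,
                          const unsigned char *b, size_t blen);

/* Stand-in comparison: bytewise compare of the common prefix, then
   treat the shorter string as if it were padded with spaces. */
int demo_compare(const unsigned char *a, size_t alen,
                 const unsigned char *b, size_t blen)
{
  size_t min = alen < blen ? alen : blen;
  int cmp = memcmp(a, b, min);
  if (cmp)
    return cmp < 0 ? -1 : 1;
  for (size_t i = min; i < alen; i++)   /* a is longer: its tail vs. spaces */
    if (a[i] != ' ')
      return a[i] < ' ' ? -1 : 1;
  for (size_t i = min; i < blen; i++)   /* b is longer: spaces vs. its tail */
    if (b[i] != ' ')
      return b[i] < ' ' ? 1 : -1;
  return 0;
}

/* Run fn(a, b) `iterations` times and return the elapsed CPU time
   in seconds. */
double time_compare(compare_fn fn, const char *a, const char *b,
                    long iterations)
{
  assert(iterations > 0);
  size_t alen = strlen(a), blen = strlen(b);
  volatile int sink = 0;   /* keeps the compiler from dropping the calls */
  clock_t start = clock();
  for (long i = 0; i < iterations; i++)
    sink = fn((const unsigned char *) a, alen,
              (const unsigned char *) b, blen);
  clock_t end = clock();
  (void) sink;
  return (double) (end - start) / CLOCKS_PER_SEC;
}

/* Ratio as used in this comment: old time divided by new time,
   so values above 1.0 mean the new code is faster. */
double speedup(double old_time, double new_time)
{
  return old_time / new_time;
}
```

In such a setup one would call `time_compare()` once with the pre-patch routine and once with the post-patch routine on the same pair of strings, then pass both timings to `speedup()`. Given the function-placement noise described above, a single ratio from a harness like this is best read together with the average over many collations.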
| Comment by Alexander Barkov [ 2021-09-13 ] |
|
serg, please have a look at the patch with your review suggestions addressed: |