[MDEV-21922] Allow packing addon fields even if they don't honour max_length_for_sort_data Created: 2020-03-12  Updated: 2020-03-17  Resolved: 2020-03-15

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.5
Fix Version/s: 10.5.2

Type: Bug Priority: Major
Reporter: Varun Gupta (Inactive) Assignee: Varun Gupta (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-6915 Allow packed keys and packed values o... Closed

 Description   

When we pick the strategy of using addon fields while doing filesort, we try to check if one can pack the addon fields or not.
This is done inside the function try_to_pack_addons().
Inside this function, we have:

  const uint sz= Addon_fields::size_of_length_field;
  if (rec_length + sz > max_length_for_sort_data)
    return;

This means that if adding the size of the length field for the addons pushes the record length above max_length_for_sort_data, we do not pack.
I think we can easily lift this limitation, because packing would be beneficial even in such cases. Once the addon fields have been picked (they already honour max_length_for_sort_data in the function filesort_use_addons()), we should still try to pack them.

This issue was found during the performance testing for packed sort keys (MDEV-21580).



 Comments   
Comment by Varun Gupta (Inactive) [ 2020-03-12 ]

The slowdown in performance was seen while benchmarking MDEV-21580.
The slowdown for varchar(200) utf8 can be seen here:

https://jira.mariadb.org/browse/MDEV-21784?focusedCommentId=146126&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-146126

Comment by Varun Gupta (Inactive) [ 2020-03-12 ]

Without the patch:

MariaDB [benchmarks]> select table_size, test_time_ms from test_runs;
+------------+--------------+
| table_size | test_time_ms |
+------------+--------------+
|      25000 |           43 |
|      50000 |           85 |
|     100000 |          215 |
|     500000 |         1077 |
|    1000000 |         2440 |
|    2000000 |         4865 |
|    4000000 |        11245 |
+------------+--------------+

With the patch where we pack addon fields:

MariaDB [benchmarks]> select table_size, test_time_ms from test_runs;
+------------+--------------+
| table_size | test_time_ms |
+------------+--------------+
|      25000 |           24 |
|      50000 |           57 |
|     100000 |          111 |
|     500000 |          548 |
|    1000000 |         1213 |
|    2000000 |         2465 |
|    4000000 |         4868 |
+------------+--------------+
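
To make the two tables easier to compare, here is a small sketch that computes the ratio of unpatched to patched time for each table size. The struct and function names are illustrative only; the numbers are copied from the two result sets above.

```cpp
#include <array>
#include <cstddef>

// One benchmark row: timings copied from the two tables in this comment.
struct Run {
  long table_size;
  long before_ms;  // test_time_ms without the patch
  long after_ms;   // test_time_ms with the patch (addon fields packed)
};

const std::array<Run, 7> kRuns = {{
  {25000, 43, 24},
  {50000, 85, 57},
  {100000, 215, 111},
  {500000, 1077, 548},
  {1000000, 2440, 1213},
  {2000000, 4865, 2465},
  {4000000, 11245, 4868},
}};

// Ratio of unpatched time to patched time; values above 1.0 mean the patch is faster.
inline double speedup(const Run& r) {
  return static_cast<double>(r.before_ms) / static_cast<double>(r.after_ms);
}

// Smallest ratio over all rows.
inline double min_speedup() {
  double m = speedup(kRuns[0]);
  for (std::size_t i = 1; i < kRuns.size(); ++i)
    if (speedup(kRuns[i]) < m) m = speedup(kRuns[i]);
  return m;
}
```

On these figures the ratio stays between roughly 1.5 and 2.3 across all table sizes, with the largest table showing the biggest gain.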

Comment by Varun Gupta (Inactive) [ 2020-03-12 ]

Patch
http://lists.askmonty.org/pipermail/commits/2020-March/014209.html

Comment by Sergei Petrunia [ 2020-03-12 ]

I'm debugging this example:

create table t20 (a varchar(200) character set utf8, b int);
insert into t20 select seq,seq from seq_1_to_10;
select * from t20 order by a;

in filesort_use_addons():

  return *length + sortlength <
         table->in_use->variables.max_length_for_sort_data;

Here, sortlength=401, *length=609.

401 is the length of strxfrm() image of the column. That is, we assume non-packed sort keys here.
(note that try_to_pack_sortkeys() hasn't been called yet and we don't actually know)

Then, execution enters Sort_param::try_to_pack_sortkeys() and at the end of that function it sets:

  rec_length= sort_length + addon_length;

Here, sort_length = 607, addon_length = 609.

Then, execution enters Sort_param::try_to_pack_addons(), where the check

  if (rec_length + sz > max_length_for_sort_data) {

is true, so the function returns without packing: rec_length=1216 while max_length_for_sort_data=1024 (sz=2, but that doesn't matter).
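
The arithmetic of this trace can be replayed with a small standalone sketch. It is not the server code: the struct and helper names are hypothetical, and only the comparisons and the numbers are taken from the snippets and values quoted in this comment.

```cpp
#include <cstdint>

// Values observed in the debugging session; field names are illustrative.
struct SortSizes {
  uint32_t unpacked_sort_length;      // sortlength seen in filesort_use_addons(): 401
  uint32_t packed_sort_length;        // sort_length after try_to_pack_sortkeys(): 607
  uint32_t addon_length;              // *length / addon_length: 609
  uint32_t max_length_for_sort_data;  // 1024
  uint32_t length_field_size;         // sz: 2
};

// Mirrors the comparison quoted from filesort_use_addons().
inline bool addons_chosen(const SortSizes& s) {
  return s.addon_length + s.unpacked_sort_length < s.max_length_for_sort_data;
}

// Mirrors the assignment at the end of try_to_pack_sortkeys().
inline uint32_t rec_length(const SortSizes& s) {
  return s.packed_sort_length + s.addon_length;
}

// Mirrors the guard in try_to_pack_addons(): true means an early return, no packing.
inline bool addon_packing_skipped(const SortSizes& s) {
  return rec_length(s) + s.length_field_size > s.max_length_for_sort_data;
}
```

With the values above, the first check passes (609 + 401 = 1010 < 1024), so addon fields are used, but the second check sees 607 + 609 + 2 = 1218 > 1024 and packing is skipped. The two checks disagree only because they use different sort key lengths.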

Comment by Sergei Petrunia [ 2020-03-12 ]

Does filesort_use_addons() do the wrong computation?

One may argue that this computation is fine, because:

Use of unpacked key length in filesort_use_addons() is fine, because unpacked key length is a good upper bound on packed key length. (One can expect the code to not switch to packed keys if they are bigger than unpacked ones).

The second check fails, because we are using maximum possible length of the packed sort key there. For utf8_general_ci, the original string form can be longer than its mem-comparable form (note that this is rarely achieved in practice. Also note that for other collations, e.g. utf8_unicode_ci, this is not the case).
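
A rough sketch of why the original string can outgrow its mem-comparable form for the varchar(200) utf8 column in this example. It assumes up to 3 bytes per character for utf8 data and 2 weight bytes per character for utf8_general_ci. It covers only the character data; the reported totals (607 and 401) include extra bytes that are not broken down here. All names are hypothetical.

```cpp
#include <cstdint>

// Assumption: utf8 (utf8mb3) stores at most 3 bytes per character.
constexpr uint32_t kMaxBytesPerCharUtf8 = 3;
// Assumption: utf8_general_ci produces 2 weight bytes per character.
constexpr uint32_t kWeightBytesPerChar = 2;

// Worst-case size of the original string data for a column of `chars` characters.
constexpr uint32_t max_original_bytes(uint32_t chars) {
  return chars * kMaxBytesPerCharUtf8;
}

// Size of the mem-comparable (strxfrm-style) image for the same column.
constexpr uint32_t mem_comparable_bytes(uint32_t chars) {
  return chars * kWeightBytesPerChar;
}
```

For 200 characters this gives 600 bytes against 400 bytes, so a length limit checked against the mem-comparable size does not bound the worst-case packed key.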
