[MDEV-5174] Filesort excessively uses disk space for TINYBLOB and TINYTEXT columns Created: 2013-10-23  Updated: 2017-11-05

Status: Open
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.4, 5.5.33, 5.1.67, 5.2.14, 5.3.12
Fix Version/s: 10.2

Type: Bug Priority: Minor
Reporter: Alexander Barkov Assignee: Alexander Barkov
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Filesort uses Field::sort_length() to calculate space needed to sort the field.

For BLOB and TEXT fields, including their TINY, MEDIUM and LONG
variants, the length is calculated as follows:

uint32 Field_blob::sort_length() const
{
  return (uint32) (current_thd->variables.max_sort_length +
                   (field_charset == &my_charset_bin ? 0 : packlength));
}

The default value of max_sort_length is 1024.

This is wasteful for TINYBLOB and TINYTEXT, whose values cannot exceed 255 bytes:

  • It should be enough to use 256 bytes to sort TINYBLOB,
  • It should be enough to use 256*strxfrm_multiply bytes to sort TINYTEXT.
    (where strxfrm_multiply is 1 for many collations, which makes 256 bytes again)

So with the default max_sort_length, TINYBLOB (and, in most cases, TINYTEXT)
uses about four times more space for sorting than is actually needed.
This is likely to hurt filesort performance noticeably.
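
One way to express the proposed bound is to cap the sort length by the largest key the column type can actually produce. The following is only a stand-alone sketch, not a patch against Field_blob: BlobSortModel and all of its members are hypothetical names, and it assumes that packlength is the number of length-prefix bytes (1 for the TINY variants) and that strxfrm_multiply is the collation's sort-key expansion factor. It uses 255 (2^8 - 1) as the TINYBLOB limit, so the result is one byte below the 256 quoted above.

```cpp
#include <cstdint>

// Hypothetical model of the values that Field_blob::sort_length() depends on.
// This is NOT MariaDB code; it only mirrors the formula quoted in this report.
struct BlobSortModel {
  uint32_t packlength;        // assumed: length-prefix bytes, 1 (TINY) .. 4 (LONG)
  bool     is_binary;         // true for *BLOB, false for *TEXT
  uint32_t strxfrm_multiply;  // assumed: sort-key expansion factor of the collation

  // Largest value the column type can store, in bytes: 2^(8*packlength) - 1.
  constexpr uint64_t max_data_length() const {
    return (uint64_t{1} << (8 * packlength)) - 1;
  }

  // Upper bound on the sort key produced from a maximum-length value.
  constexpr uint64_t key_bytes_needed() const {
    return max_data_length() * (is_binary ? 1 : strxfrm_multiply);
  }

  // Extra bytes added for non-binary charsets, as in the quoted formula.
  constexpr uint32_t length_suffix() const {
    return is_binary ? 0 : packlength;
  }

  // Behaviour described in this report: max_sort_length is always used.
  constexpr uint32_t sort_length_current(uint32_t max_sort_length) const {
    return max_sort_length + length_suffix();
  }

  // Sketch of the proposal: never reserve more than the type can need.
  constexpr uint32_t sort_length_capped(uint32_t max_sort_length) const {
    return static_cast<uint32_t>(key_bytes_needed() < max_sort_length
                                     ? key_bytes_needed()
                                     : max_sort_length) +
           length_suffix();
  }
};
```

With max_sort_length = 1024, this model gives 1024 (current) versus 255 (capped) for TINYBLOB, and 1025 versus 256 for TINYTEXT with strxfrm_multiply = 1. For BLOB and larger types the cap does not apply and both functions return the same value.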

