[MDEV-6529] optimize VARCHAR temp storage during EITS ANALYZE Created: 2014-08-05 Updated: 2024-02-06 |
|
| Status: | Confirmed |
| Project: | MariaDB Server |
| Component/s: | Optimizer |
| Fix Version/s: | 11.5 |
| Type: | New Feature | Priority: | Critical |
| Reporter: | Sergei Petrunia | Assignee: | Sergei Petrunia |
| Resolution: | Unresolved | Votes: | 3 |
| Labels: | eits | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||
| Description |
|
as a separate issue: Motivated by CSC#7897, I've tried to see how much space can ANALYZE use. (So far my understanding was that max. temp dir usage would be rougly equal to size of the table w/o indexes). Let's try this:
The table has 1M rows. Rows in the VARCHAR(100) column occupy less than The table on disk:
Now,
It's writing 300 bytes, the unpacked length. In total, we get:
300MBit = 286 M, i.e. ANALYZE may require much more space than is occupied by the table (44M) |
| Comments |
| Comment by Elena Stepanova [ 2015-11-05 ] | |||||||||||||
|
| |||||||||||||
| Comment by Sergei Petrunia [ 2015-11-11 ] | |||||||||||||
|
Need to check with igor about the status of this. | |||||||||||||
| Comment by Sergei Petrunia [ 2015-11-11 ] | |||||||||||||
|
There is a change in MySQL 5.7 that could be used for this. Igor has started merging it, but didn't finish. | |||||||||||||
| Comment by Sergei Petrunia [ 2015-11-27 ] | |||||||||||||
|
The change I was talking about in the previous comment is
| |||||||||||||
| Comment by Abdul Manaf [ 2019-04-22 ] | |||||||||||||
|
Team - is there any plan to fix this in coming versions ? | |||||||||||||
| Comment by Sergei Petrunia [ 2019-04-22 ] | |||||||||||||
|
navodaya05 this specific deficiency is not fixed. But there is a feature in MariaDB 10.4 which allows to have a workaround. It's "Bayesian sampling for histograms": https://mariadb.com/kb/en/library/server-system-variables/#analyze_sample_percentage The default is 100, which means "use the whole dataset", like pre-10.4 versions did. If you set it to a smaller value, then the histogram will be built from a sample, and thus require less space. | |||||||||||||
| Comment by Michael Widenius [ 2022-03-14 ] | |||||||||||||
|
Shouldn't | |||||||||||||
| Comment by Mason Sharp [ 2023-10-13 ] | |||||||||||||
|
Is hyperloglog a potential solution here for VARCHAR columns to determine selectivity and not use much space? | |||||||||||||
| Comment by Michael Widenius [ 2023-11-20 ] | |||||||||||||
|
We have an old patch, MDEV-21829, for reducing the size of VARCHAR as part of UNIQUE originally created by Varun that I am involved in fixing. | |||||||||||||
| Comment by Mason Sharp [ 2024-01-22 ] | |||||||||||||
|
Linked to MDEV-32472 |