[MDEV-6279] Possible performance issue with semi-join (TokuDB's records_in_range) Created: 2014-05-29 Updated: 2020-10-28 Resolved: 2020-10-28 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - TokuDB |
| Affects Version/s: | 5.5.37, 10.0.11 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Sergei Petrunia | Assignee: | Sergei Petrunia |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | optimizer, records_in_range | ||
| Issue Links: |
|
||||||||
| Description |
|
This is a separate issue to investigate possible performance degradation with semi-join as reported in Summary of data from comments and uploaded attachments for MySQL_Myisam (reported response time: 1 sec)
Maria_Myisam_Optimizer_off (reported resp. time 1 second)
Maria_Myisam_Optimizer_On (reported resp. time 23 seconds)
Maria_Toku_Optimizer_Off (reported resp. time 10 sec)
Maria_Toku_Optimizer_On (reported resp. time 5 sec)
|
| Comments |
| Comment by Sergei Petrunia [ 2014-05-29 ] | |||||||||||||||
|
On the question why Maria_Toku_Optimizer_Off is so much slower than Maria_Myisam_Optimizer_off. Comparing the query plans:
The join orders are the same.
I suspect TokuDB returns a very optimistic records_in_range estimate for wh_tagged_attr_fbk_data_f. If one really could read just 8427 rows from that table, we would have picked ref access with MyISAM. Sivaram_d, can you try the query with "FROM wh_tag_d" replaced with "FROM wh_tag_d USE INDEX()". TokuDB, optimizer_switch='semijoin=off' ? | |||||||||||||||
| Comment by Sivaram Dandibhotla [ 2014-05-29 ] | |||||||||||||||
|
Replacing "wh_tag_d" with "wh_tag_d USE INDEX()" didn't make any difference, it still takes approximately 10 seconds. | |||||||||||||||
| Comment by Sergei Petrunia [ 2014-06-02 ] | |||||||||||||||
|
Thanks for the answer. I am sorry, I have made a typo - we need "USE INDEX() not for wh_tag_d, but for wh_tagged_attr_fbk_data_f (as this is the table for which the query plan is different). Could you please try the query with "FROM wh_tagged_attr_fbk_data_f" replaced with "FROM wh_tagged_attr_fbk_data_f USE INDEX()". TokuDB, optimizer_switch='semijoin=off' ? | |||||||||||||||
| Comment by Sivaram Dandibhotla [ 2014-06-03 ] | |||||||||||||||
|
Now the response comes back in less than second as compared to approximately 9 seconds without this hint. Looks like with this hint optimizer is doing a full table scan but still faster even though the index that is used earlier (without this hint) is the right index to be used for the join condition. Can you please explain this behavior. Earlier we noticed that when we use "index" hints optimizer behavior is unpredictable, performance varies for different data sets of same size. Hence want to fully understand the risk involved in using this hint. Thanks Siva | |||||||||||||||
| Comment by Sergei Petrunia [ 2014-06-03 ] | |||||||||||||||
If using the index is slower, then it's not the right index to be used It is not always faster to use an index. Using an index is faster only if you can read fewer records. If you read nearly the same number of records, scanning through an index is typically slower than doing a full table scan. Let's look at the last line of the diff posted above:
With MyISAM, the optimizer decided to do a full scan and scan 842779 rows.
which shows that query execution has read 842K rows in total. All other tables in the query are smaller than that, so I conclude that these 842K rows were read from wh_tagged_attr_fbk_data_f. That is, TokuDB has grossly underestimated the number of rows it will have to read through index WH_TAGGED_ATTR_FBK_DATA_F_F3. | |||||||||||||||
| Comment by Sergei Petrunia [ 2014-06-03 ] | |||||||||||||||
|
This needs attention of a TokuDB developer. prohaska7, here we have a case where the only explanation of what's going on is that ha_toku::records_in_range() returns a value that is 100x smaller than the reality. Is it a known problem with TokuDB? Are there any workarounds? (e.g. will ANALYZE TABLE help?) | |||||||||||||||
| Comment by Sivaram Dandibhotla [ 2014-06-16 ] | |||||||||||||||
|
Will this fix be available in 10.0.13? Please confirm. | |||||||||||||||
| Comment by Sergei Petrunia [ 2014-06-17 ] | |||||||||||||||
|
Since the problem seems to be in TokuDB, we depend on Tokutek for the fix. I'll ping them about this. | |||||||||||||||
| Comment by Sergei Petrunia [ 2014-06-18 ] | |||||||||||||||
|
Ok, TokuDB team will start looking into this next week. Since we don't yet know any details about the problem, it is not yet possible to tell whether 10.0.13 will have a fix for this. | |||||||||||||||
| Comment by Sergei Petrunia [ 2020-10-28 ] | |||||||||||||||
|
Won't fix as TokuDB is not shipped in the earlier versions |