Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13372

Optimizer creates very bad execution plan for tokudb on random

    XMLWordPrintable

Details

    Description

      Hi Guys, im having huge issue with query optimizer it seems it makes it impossible to finish a very simple query using tokudb, with some basic joins... basically not only this query is affected but it seems it affects all queries on these 2 tables that are using >=2 JOINS
      (im doing joins only between these two)

      CREATE TABLE `stats_traffic_value` (
        `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
        `values` varchar(2048) NOT NULL,
        `_last_used` date NOT NULL,
        PRIMARY KEY (`id`),
        UNIQUE KEY `values` (`values`)
      ) ENGINE=TokuDB AUTO_INCREMENT=4249003 DEFAULT CHARSET=ascii;
      

      CREATE TABLE `stats_traffic` (
        `date` date NOT NULL,
        `network_id` int(11) NOT NULL,
        `publisher_id` varchar(255) NOT NULL,
        `zone` varchar(255) NOT NULL,
        `language` char(2) NOT NULL,
        `country` char(2) NOT NULL,
        `stv_domain` int(11) NOT NULL,
        `stv_tag` int(11) NOT NULL,
        `stv_tag_secondary` int(11) NOT NULL,
        `stv_gender` int(11) NOT NULL,
        `stv_looking_for` int(11) NOT NULL,
        `device` varchar(20) NOT NULL,
        `_sandbox` varchar(255) NOT NULL,
        `impressions` int(11) NOT NULL DEFAULT '0',
        `unique` int(11) NOT NULL,
        `free_impressions` int(11) NOT NULL DEFAULT '0',
        `free_unique` int(11) NOT NULL,
        `clicks` int(11) NOT NULL,
        `clicks_total` int(11) NOT NULL,
        `conversions` int(11) NOT NULL,
        `conversion_value` decimal(10,3) NOT NULL,
        `spending` decimal(10,4) NOT NULL,
        `earnings` decimal(12,6) NOT NULL,
        `_key` binary(27) NOT NULL,
        UNIQUE KEY `_key` (`date`,`_key`) USING HASH
      ) ENGINE=TokuDB DEFAULT CHARSET=ascii
      /*!50100 PARTITION BY HASH (TO_DAYS(date) DIV 20)
      PARTITIONS 5 */;
      

      Sample Query:

      SELECT    	
      `tf`.`network_id` as network,
      `tf`.`publisher_id`,
      `tf`.`country`,
      `tf`.`zone`,
      `tv2`.`value` AS `tag_secondary`,	
      `tv`.`value` AS `tag`,
      		
      `tf`.`device` as `device`,
      `tf`.`_sandbox` as `sandbox`,
      		      
      SUM(`impressions`) as impressions,          
      SUM(`free_impressions`) as free_impressions
      FROM `pstats`.`stats_traffic` `tf`
      INNER JOIN `pstats`.`stats_traffic_value_lookup` `tv` ON `tv`.`stats_traffic_value_id` = `tf`.`stv_tag`
      INNER JOIN `pstats`.`stats_traffic_value_lookup` `tv2` ON `tv2`.`stats_traffic_value_id` = `tf`.`stv_tag_secondary`
        
      WHERE 1=1 AND date = '2017-07-21' AND `tf`.`_sandbox` = '' AND
      `tv`.`value`='gay' AND country='us'	AND `tf`.network_id = 51
      		
      GROUP BY `publisher_id`, `zone`, `device`, `_sandbox`, 
      stv_gender, `tv2`.`value`, `tv`.value, stv_looking_for
      

      Now there are 2 execution plans which are made, seems randomly... one is taking about 3 seconds to finish, the second one never finishes...

      Good Plan:
      https://www.screencast.com/t/rqBCtKYIn

      Bad Plan:
      https://www.screencast.com/t/9Mfh9EOk

      Now, the main table has about 40 million rows, good plan results AFAIK in picking about 125k rows by DATE index, and then simple join with lookup tables (using PRI KEY on IDs) so numbers are converted to string. Very simple i treat this second table as dictionary... query takes about 2-4 seconds...

      For bad plan not sure what it's trying to do as i read it's trying to use index of table used only for group by as covering index... basically due to this strategy it now stars to run for 2'nd hour, with "Queried about 1170790000 rows"

      When i remove the second join (it always works good... attached as query1-join). Any idea how can i fix that or debug, so it'll always "start" with the main table filter by date and do ref-joins? Thanks

      Attachments

        1. bad-plan.png
          bad-plan.png
          45 kB
        2. good-plan.png
          good-plan.png
          43 kB
        3. query-1join.png
          query-1join.png
          48 kB

        Issue Links

          Activity

            People

              alice Alice Sherepa
              pslawek83 Slawomir Pryczek
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.