Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20568

Tokudb, very strange, substantial performance drop

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Minor
    • Resolution: Unresolved
    • 10.3.12, 10.4.7
    • None
    • None
    • FC 30,
      Debian 9 Vmware

    Description

      Hi Guys, during some testing I found a very strange performance issue with toku. Basically when doing a select with WHERE date IN (....) - tokudb is about 4 times as fast to run the select for multiple days than for single day.

      Checked this on innodb as well and the EXPLAIN format=json is same as for toku, but execution times are exactly the same. So this problem seems to be limited to tokudb only.

      To reproduce it you'll need 2 queries
      A. SELECT /SQL_NO_CACHE/ sum(impressions) FROM stats_tr_creative_2001 where date IN ('2019-08-01', '2030-08-02')

      B. SELECT /SQL_NO_CACHE/ sum(impressions) FROM stats_tr_creative_2001 where date IN ('2019-08-01')

      Always run query A first and A is always faster for us than query B, even with cold keycache on A and hot keycache on B ... A would be ~2 times faster. When keycache status is same we have around x4 difference (eg. 0.6s for A and 2.8s on B)

      Reporting as minor because workaround is just to include some non-existing date in the past or in the future, but maybe fixing it would improve toku performance a lot in other scenarios.

      For query A.

      {
        "query_block": {
          "select_id": 1,
          "table": {
            "table_name": "stats_tr_creative_2001",
            "partitions": ["p1"],
            "access_type": "ref",
            "possible_keys": ["raw_stats_lookup_id", "lookup_date"],
            "key": "raw_stats_lookup_id",
            "key_length": "3",
            "used_key_parts": ["date"],
            "ref": ["const"],
            "rows": 771332,
            "filtered": 100
          }
        }
      }
      

      For query B.

      {
        "query_block": {
          "select_id": 1,
          "table": {
            "table_name": "stats_tr_creative_2001",
            "partitions": ["p1", "p2"],
            "access_type": "range",
            "possible_keys": ["raw_stats_lookup_id", "lookup_date"],
            "key": "raw_stats_lookup_id",
            "key_length": "3",
            "used_key_parts": ["date"],
            "rows": 771335,
            "filtered": 100,
            "attached_condition": "stats_tr_creative_2001.`date` in ('2019-08-01','2030-08-02')"
          }
        }
      }
      

      Table struct:

      CREATE TABLE `stats_tr_creative_2001` (
        `network_id` smallint(6) NOT NULL,
        `creative_id` int(11) NOT NULL,
        `stats_tr_value_id_domain` int(11) NOT NULL,
        `stats_tr_value_id` int(11) NOT NULL,
        `date` date NOT NULL,
        `impressions` int(11) NOT NULL,
        UNIQUE KEY `raw_stats_lookup_id` (`date`,`network_id`,`creative_id`,`stats_tr_value_id_domain`,`stats_tr_value_id`) USING BTREE,
        KEY `lookup_date` (`date`,`stats_tr_value_id_domain`)
      ) ENGINE=TokuDB DEFAULT CHARSET=ascii
       PARTITION BY HASH (to_days(`date`) DIV 20)
      PARTITIONS 5;
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            pslawek83 Slawomir Pryczek
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.