Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19670

MyRocks: key lookups into deleted data are very slow

    XMLWordPrintable

Details

    Description

      This is based on a report from a user.
      Index lookups become very slow after a bulk delete operation.

      Steps to reproduce with a totally synthetic dataset:

      create table ten(a int);
      insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
       
      create table one_k(a int);
      insert into one_k select A.a + B.a* 10 + C.a * 100 from ten A, ten B, ten C;
      

      create table t1 (
        pk bigint primary key,
        a bigint,
        filler varchar(32),
        key(a)
      )engine=rocksdb;
       
      set rocksdb_max_row_locks=1000*1000*1024;
      insert into t1 select 
        A.a+ 1000*B.a,  A.a+ 1000*B.a, A.a+ 1000*B.a 
      from 
        one_k A, one_k B;
      set global rocksdb_force_flush_memtable_now=1;
      

      The query (originally it was a join):

      explain 
      select * from t1 
      where 
        a in (
      100,101,102,103,104,105,106,107,108,109,
      110,111,112,113,114,115,116,117,118,119,
      120,121,122,123,124,125,126,127,128,129,
      130,131,132,133,134,135,136,137,138,139,
      140,141,142,143,144,145,146,147,148,149,
      150,151,152,153,154,155,156,157,158,159,
      160,161,162,163,164,165,166,167,168,169,
      170,171,172,173,174,175,176,177,178,179,
      180,181,182,183,184,185,186,187,188,189,
      190,191,192,193,194,195,196,197,198,199,200);
      

      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                 |
      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      |    1 | SIMPLE      | t1    | range | a             | a    | 9       | NULL |  293 | Using index condition |
      +------+-------------+-------+-------+---------------+------+---------+------+------+-----------------------+
      

      The query itself returns 100 rows in 0.01 sec.

      Now, let's do the same in the transaction:

      begin;
      delete from t1 where a between 100 and 100+99000;
      # Query OK, 99001 rows affected (0.67 sec)
      delete from t1 where a between 99000 and 200000;
      # Query OK, 100900 rows affected (0.73 sec)
      delete from t1 where a between 20000 and 300000;
      # Query OK, 100000 rows affected (4.11 sec)
      

      Re-running the query:

      explain 
      select * from t1 
      where 
        a in (
      100,101,102,103,104,105,106,107,108,109,
      110,111,112,113,114,115,116,117,118,119,
      120,121,122,123,124,125,126,127,128,129,
      130,131,132,133,134,135,136,137,138,139,
      140,141,142,143,144,145,146,147,148,149,
      150,151,152,153,154,155,156,157,158,159,
      160,161,162,163,164,165,166,167,168,169,
      170,171,172,173,174,175,176,177,178,179,
      180,181,182,183,184,185,186,187,188,189,
      190,191,192,193,194,195,196,197,198,199,200);
      

      The EXPLAIN is the same, but the query now takes 1.25 sec.
      (this is a reduced testcase. In the original report, the join query couldn't finish at all)

      Attachments

        Activity

          People

            psergei Sergei Petrunia
            psergei Sergei Petrunia
            Votes:
            1 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.