Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18226

Index cardinality is not synced across Galera cluster and stays at 0

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 10.2
    • Fix Version/s: N/A
    • Environment:

      Description

      Hello,

      We have several MariaDB galera clusters on Debian 8. On all these clusters, we've noticed some pretty significant performance issues that appear to be related to the index cardinality being incorrect.

      I have managed to reproduce this issue. For this I have attached "data.sql" which has the table schema where I generated dummy data to reproduce it.

      Steps are like this:
      Setup:
      3 MariaDB 10.2 Galera nodes

      NODE 1:
      1. Log in to MySQL and `create database cardtest02;`
      2. Import the attached `data.sql` file into `cardtest02`: `mysql --defaults-file=/etc/mysql/mysql-root.cnf cardtest02 < ~/data.sql`
      3. Verify index cardinality:

      use cardtest02;
      show indexes from cardtest_tbl;                                                                                                                                                              
      +--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      | Table        | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
      +--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      | cardtest_tbl |          0 | PRIMARY  |            1 | id          | A         |         301 |     NULL | NULL   |      | BTREE      |         |               |
      +--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      1 row in set (0.00 sec)
      

      As you can see, in this case, on node 1, Cardinality is correct (301). However, if we now check this on another , for example node 2, we see the following:

      NODE 2:

      MariaDB [cardtest02]> show indexes from cardtest_tbl;
      +--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      | Table        | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
      +--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      | cardtest_tbl |          0 | PRIMARY  |            1 | id          | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
      +--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
      1 row in set (0.00 sec)
      

      Because of this issue, indexes aren't properly being used and the performance is very bad, especially on the larger tables.

      In addition, I have also noticed the `information_schema.tables` table being incorrect:

      NODE 1:

      MariaDB [(none)]> select * from information_schema.tables WHERE TABLE_NAME = 'cardtest_tbl' \G;
      *************************** 1. row ***************************
        TABLE_CATALOG: def
         TABLE_SCHEMA: cardtest02
           TABLE_NAME: cardtest_tbl
           TABLE_TYPE: BASE TABLE
               ENGINE: InnoDB
              VERSION: 10
           ROW_FORMAT: Compressed
           TABLE_ROWS: 301
       AVG_ROW_LENGTH: 81
          DATA_LENGTH: 24576
      MAX_DATA_LENGTH: 0
         INDEX_LENGTH: 0
            DATA_FREE: 0
       AUTO_INCREMENT: 1324
          CREATE_TIME: 2019-01-14 12:37:20
          UPDATE_TIME: 2019-01-14 12:37:20
           CHECK_TIME: NULL
      TABLE_COLLATION: utf8_general_ci
             CHECKSUM: NULL
       CREATE_OPTIONS: row_format=COMPRESSED
        TABLE_COMMENT: assessment table retrofitted from MySQL
      1 row in set (0.01 sec)
      

      NODE 2:

      MariaDB [cardtest02]> select * from information_schema.tables WHERE TABLE_NAME = 'cardtest_tbl' \G;
      *************************** 1. row ***************************
        TABLE_CATALOG: def
         TABLE_SCHEMA: cardtest02
           TABLE_NAME: cardtest_tbl
           TABLE_TYPE: BASE TABLE
               ENGINE: InnoDB
              VERSION: 10
           ROW_FORMAT: Compressed
           TABLE_ROWS: 0
       AVG_ROW_LENGTH: 0
          DATA_LENGTH: 8192
      MAX_DATA_LENGTH: 0
         INDEX_LENGTH: 0
            DATA_FREE: 0
       AUTO_INCREMENT: 1324
          CREATE_TIME: 2019-01-14 12:37:20
          UPDATE_TIME: 2019-01-14 12:37:20
           CHECK_TIME: NULL
      TABLE_COLLATION: utf8_general_ci
             CHECKSUM: NULL
       CREATE_OPTIONS: row_format=COMPRESSED
        TABLE_COMMENT: assessment table retrofitted from MySQL
      1 row in set (0.00 sec)
      

      A temporary workaround to fix this is running an `analyze table cardtest_tbl`; on one of the nodes and it is synced across all of them automatically after that, but it is of course only temporary. Even if I add more rows later (with the same .sql file for example when I only change IDs and skip the drop table if exists) the cardinality is only ever updated on the node where I added the rows and not on node 2 / 3.

      I also tried to see what happens if I add new rows on node 2 instead after the initial dump, and it does increase the cardinality there with the amount of rows inserted on node 2, but it never reflects anywhere near the actual amount of rows.
      Interesting to note is the following:
      1. Import data.sql on node 1, which recreates an empty table and adds rows, bringing cardinality to 301 in node 1 and 0 on node 2/3
      2. Run analyze table on on any node, resulting in a synchronisation of the cardinality; all nodes now have 301
      3. Running single queries on node 1 and/node 2 do increase cardinality correctly by 1:

      NODE 1:
      INSERT INTO `cardtest_tbl` VALUES (3000,1,'','',1,1466430455,1471454450,NULL,10.00000,1,NULL,'','',NULL,NULL,1,NULL,NULL);
      NODE 2:
      INSERT INTO `cardtest_tbl` VALUES (3001,1,'','',1,1466430455,1471454450,NULL,10.00000,1,NULL,'','',NULL,NULL,1,NULL,NULL);
      

      After this, cardinality is 303 on all nodes.
      3. Import data2.sql (see attachment) through node 1, bringing cardinality to 604 on node 1, and back to 301 on node 2

      I have also attached mysql_settings.txt which contains the output of `show variables`, which may be helpful.

      Does anyone have any idea of how to fix / prevent this? Is it a bug or intended behaviour we're unaware of? Running analyze table on all tables is not something we like to do, especially as it is locking on the entire cluster and usually kills the entire cluster (which is another issue entirely I believe we can work around with by using wsrep_desync and RSU options)

      Cheers,
      Niels

        Attachments

        1. data.sql
          28 kB
        2. data2.sql
          28 kB
        3. galera_index_cardinality.cnf
          0.6 kB
        4. galera_index_cardinality.test
          27 kB
        5. mysql_settings.txt
          79 kB

          Issue Links

            Activity

              People

              Assignee:
              jplindst Jan Lindström
              Reporter:
              nielsh Niels Hendriks
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: