Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33351

Corruption of indexes on generated virtual column(s) on an optimistic parallel replica

    XMLWordPrintable

Details

    Description

      Set up a replication, for example, like between this two instances on the same box:

      openxs@ao756:~/dbs/maria10.6$ ps aux | grep maria
      openxs   1831742  8.7  2.9 1546012 111544 ?      Sl   січ31 100:46 /home/openxs/dbs/maria10.6/bin/mariadbd --no-defaults --basedir=/home/openxs/dbs/maria10.6 --datadir=/home/openxs/node1 --plugin-dir=/home/openxs/dbs/maria10.6/lib/plugin --log-bin --binlog_format=row --server_id=1 --log-error=/home/openxs/node1/ao756.err --pid-file=ao756.pid --socket=/tmp/node1.sock --port=3306
      openxs   1926056 41.0  2.8 1539216 109840 pts/2  Sl   10:55  56:25 /home/openxs/dbs/maria10.6/bin/mariadbd --no-defaults --basedir=/home/openxs/dbs/maria10.6 --datadir=/home/openxs/node2 --plugin-dir=/home/openxs/dbs/maria10.6/lib/plugin --log-bin --server_id=2 --slave_parallel_threads=4 --slave_parallel_max_queued=512KB --log-error=/home/openxs/node2/ao756.err --pid-file=ao756.pid --socket=/tmp/node2.sock --port=3307
      

      Note that slave_parallel_threads should be > 1, larger slave_parallel_max_queued value than default also seems to help to hit the problem faster.

      Then, on a primary node create this table and insert a single row in it:

      openxs@ao756:~/dbs/maria10.6$ bin/mysql --socket=/tmp/node1.sock test
      Reading table information for completion of table and column names
      You can turn off this feature to get a quicker startup with -A
       
      Welcome to the MariaDB monitor.  Commands end with ; or \g.
      Your MariaDB connection id is 108187
      Server version: 10.6.17-MariaDB-log Source distribution
       
      Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
       
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
       
      MariaDB [test]> select @@binlog_format;
      +-----------------+
      | @@binlog_format |
      +-----------------+
      | ROW             |
      +-----------------+
      1 row in set (0,000 sec)
       
      MariaDB [test]> show create table tts\G
      *************************** 1. row ***************************
             Table: tts
      Create Table: CREATE TABLE `tts` (
        `id` int(11) NOT NULL,
        `value` longtext NOT NULL CHECK (json_valid(`value`)),
        `ts` bigint(20) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.creationTS')),'null')) VIRTUAL,
        `dt` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.executionDate')),'null')) VIRTUAL,
        `c1` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c1')),'null')) VIRTUAL,
        `c2` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c2')),'null')) VIRTUAL,
        `c3` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c3')),'null')) VIRTUAL,
        `c4` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c4')),'null')) VIRTUAL,
        `c5` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c5')),'null')) VIRTUAL,
        `c6` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c6')),'null')) VIRTUAL,
        `c7` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c7')),'null')) VIRTUAL,
        `c8` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c8')),'null')) VIRTUAL,
        `c9` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c9')),'null')) VIRTUAL,
        `c10` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c10')),'null')) VIRTUAL,
        `c12` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c11')),'null')) VIRTUAL,
        `c11` varchar(15) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c12')),'null')) VIRTUAL,
        `c0` varchar(150) GENERATED ALWAYS AS (nullif(json_unquote(json_extract(`value`,'$.c0')),'null')) VIRTUAL,
        PRIMARY KEY (`id`),
        KEY `ix_ts` (`ts`),
        KEY `ix_dt` (`dt`),
        KEY `c1` (`c1`),
        KEY `c2` (`c2`),
        KEY `c3` (`c3`),
        KEY `c4` (`c4`),
        KEY `c5` (`c5`),
        KEY `c6` (`c6`),
        KEY `c7` (`c7`),
        KEY `c8` (`c8`),
        KEY `c9` (`c9`),
        KEY `c10` (`c10`),
        KEY `c11` (`c11`),
        KEY `c12` (`c12`),
        KEY `ix_c1_ts` (`c1`,`ts`),
        KEY `ix_c1_c2` (`c1`,`c2`),
        KEY `ix_c1_c3` (`c1`,`c3`),
        KEY `ix_c3_dt` (`c3`,`dt`),
        KEY `ix_ts_c1` (`ts`,`c1`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci
      1 row in set (0,000 sec)
       
      MariaDB [test]> insert into tts (`id`, `value`) values (0, concat('{\"creationTS\":1613655148058,\"executionDate\":1613655115666,\"c1\":\"value1\",\"c2\":1613655148058,\"c3\":"\long value\",\"c4\":1613655115666,\"c5\":123456,\"c6\":1234567,\"c7\":7777777,\"c8\":888888888,"\c9\":99999,"\c10\":\"111111111\","\c11\":111111,"\c12\":1212121212,"\c0\":\"0000000000000000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111111111111111111111110000000\",\"data\":', concat(repeat(2,100400), '}')));
      Query OK, 1 row affected (0,274 sec)
       
      MariaDB [test]> select id, ts, dt, c2, c4 from tts;
      +----+---------------+---------------+---------------+---------------+
      | id | ts            | dt            | c2            | c4            |
      +----+---------------+---------------+---------------+---------------+
      |  0 | 1613655148058 | 1613655115666 | 1613655148058 | 1613655115666 |
      +----+---------------+---------------+---------------+---------------+
      1 row in set (0,013 sec)
       
      MariaDB [test]> show table status like 'tts'\G
      *************************** 1. row ***************************
                  Name: tts
                Engine: InnoDB
               Version: 11
            Row_format: Dynamic
                  Rows: 1
        Avg_row_length: 16384
           Data_length: 16384
       Max_data_length: 0
          Index_length: 311296
             Data_free: 0
        Auto_increment: NULL
           Create_time: 2024-02-01 11:58:04
           Update_time: 2024-02-01 12:32:00
            Check_time: NULL
             Collation: latin1_swedish_ci
              Checksum: NULL
        Create_options: 
               Comment: 
      Max_index_length: 0
             Temporary: N
      1 row in set (0,001 sec)
      

      The create a shell script that will update the row twice at each step, like this:

      openxs@ao756:~/dbs/maria10.6$ cat ~/tts.sh 
      #!/bin/bash
      currentTimeMillis_OLD_VALUE=1613655115666
      executionDate_OLD_VALUE=1613655148058
       
      while [ true ]
      do
      if [[ -f /tmp/touch.go ]]; then
        echo "Gracefully stopping"
        exit
      else
      currentTimeMillis_NEW_VALUE=$((currentTimeMillis_OLD_VALUE+1))
      executionDate_NEW_VALUE=$((executionDate_OLD_VALUE+1))
      echo "UPDATE tts SET value = REGEXP_REPLACE(value, '\":${currentTimeMillis_OLD_VALUE}','\":${currentTimeMillis_NEW_VALUE}');" > /tmp/tmp.sql
      echo "UPDATE tts SET value = REGEXP_REPLACE(value, '\":${executionDate_OLD_VALUE}','\":${executionDate_NEW_VALUE}');" >> /tmp/tmp.sql
      bin/mysql --socket=/tmp/node1.sock test <<EOF
      source /tmp/tmp.sql;
      EOF
      currentTimeMillis_OLD_VALUE=$((currentTimeMillis_OLD_VALUE+1))
      executionDate_OLD_VALUE=$((executionDate_OLD_VALUE+1))
      fi
      done
      

      Run the script (that do UPDATEs on primary) for some time, depending on the hardware:

      openxs@ao756:~/dbs/maria10.6$ /bin/bash ~/tts.sh 
      
      

      Note that it is single threaded. Eventually (I had to wait for dozens of minutes), it fails and replication stops:

      openxs@ao756:~/dbs/maria10.6$ bin/mysql --socket=/tmp/node2.sock test
      Reading table information for completion of table and column names
      You can turn off this feature to get a quicker startup with -A
       
      Welcome to the MariaDB monitor.  Commands end with ; or \g.
      Your MariaDB connection id is 1544
      Server version: 10.6.17-MariaDB-log Source distribution
       
      Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
       
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
       
      MariaDB [test]> show slave status\G
      *************************** 1. row ***************************
                      Slave_IO_State: Waiting for master to send event
                         Master_Host: 192.168.1.19
                         Master_User: repl
                         Master_Port: 3306
                       Connect_Retry: 60
                     Master_Log_File: ao756-bin.000004
                 Read_Master_Log_Pos: 108315415
                      Relay_Log_File: ao756-relay-bin.000005
                       Relay_Log_Pos: 401673374
               Relay_Master_Log_File: ao756-bin.000002
                    Slave_IO_Running: Yes
                   Slave_SQL_Running: No
                     Replicate_Do_DB: 
                 Replicate_Ignore_DB: 
                  Replicate_Do_Table: 
              Replicate_Ignore_Table: 
             Replicate_Wild_Do_Table: 
         Replicate_Wild_Ignore_Table: 
                          Last_Errno: 1034
                          Last_Error: Could not execute Update_rows_v1 event on table test.tts; Index for table 'tts' is corrupt; try to repair it, Error_code: 1034; handler error HA_ERR_CRASHED; the event's master log ao756-bin.000002, end_log_pos 402081942
                        Skip_Counter: 0
                 Exec_Master_Log_Pos: 401879515
                     Relay_Log_Space: 3323547850
                     Until_Condition: None
                      Until_Log_File: 
                       Until_Log_Pos: 0
                  Master_SSL_Allowed: No
                  Master_SSL_CA_File: 
                  Master_SSL_CA_Path: 
                     Master_SSL_Cert: 
                   Master_SSL_Cipher: 
                      Master_SSL_Key: 
               Seconds_Behind_Master: NULL
       Master_SSL_Verify_Server_Cert: No
                       Last_IO_Errno: 0
                       Last_IO_Error: 
                      Last_SQL_Errno: 1034
                      Last_SQL_Error: Could not execute Update_rows_v1 event on table test.tts; Index for table 'tts' is corrupt; try to repair it, Error_code: 1034; handler error HA_ERR_CRASHED; the event's master log ao756-bin.000002, end_log_pos 402081942
         Replicate_Ignore_Server_Ids: 
                    Master_Server_Id: 1
                      Master_SSL_Crl: 
                  Master_SSL_Crlpath: 
                          Using_Gtid: No
                         Gtid_IO_Pos: 
             Replicate_Do_Domain_Ids: 
         Replicate_Ignore_Domain_Ids: 
                       Parallel_Mode: optimistic
                           SQL_Delay: 0
                 SQL_Remaining_Delay: NULL
             Slave_SQL_Running_State: 
                    Slave_DDL_Groups: 4
      Slave_Non_Transactional_Groups: 0
          Slave_Transactional_Groups: 90128
      1 row in set (0,050 sec)
      

      In the error log of replica we see the following (note that it took me 30+ minutes of wait until the problem happened):

      ...
      2024-02-01 11:59:24 764 [Note] Slave I/O thread: Start asynchronous replication to master 'repl@192.168.1.19:3306' in log 'ao756-bin.000001' at position 328
      2024-02-01 11:59:24 764 [Note] Slave I/O thread: connected to master 'repl@192.168.1.19:3306',replication started in log 'ao756-bin.000001' at position 328
      2024-02-01 11:59:24 765 [Note] Slave SQL thread initialized, starting replication in log 'ao756-bin.000001' at position 328, relay log './ao756-relay-bin.000001' position: 4
      2024-02-01 12:32:48 765 [Note] Error reading relay log event: slave SQL thread was killed
      2024-02-01 12:32:48 765 [Note] Slave SQL thread exiting, replication stopped in log 'ao756-bin.000001' at position 4649744, master: 192.168.1.19:3306
      2024-02-01 12:32:48 764 [Note] Slave I/O thread exiting, read up to log 'ao756-bin.000001', position 4649744, master 192.168.1.19:3306
      2024-02-01 12:32:52 1117 [Note] Slave I/O thread: Start asynchronous replication to master 'repl@192.168.1.19:3306' in log 'ao756-bin.000001' at position 4649744
      2024-02-01 12:32:52 1117 [Note] Slave I/O thread: connected to master 'repl@192.168.1.19:3306',replication started in log 'ao756-bin.000001' at position 4649744
      2024-02-01 12:32:52 1118 [Note] Slave SQL thread initialized, starting replication in log 'ao756-bin.000001' at position 4649744, relay log './ao756-relay-bin.000002' position: 4649971
      2024-02-01 13:04:49 1120 [ERROR] InnoDB: We detected index corruption in an InnoDB type table. You have to dump + drop + reimport the table or, in a case of widespread corruption, dump all InnoDB tables and recreate the whole tablespace. If the mariadbd server crashes after the startup or when you dump the tables. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      2024-02-01 13:04:49 1120 [ERROR] mariadbd: Index for table 'tts' is corrupt; try to repair it
      2024-02-01 13:04:49 1120 [ERROR] InnoDB: We detected index corruption in an InnoDB type table. You have to dump + drop + reimport the table or, in a case of widespread corruption, dump all InnoDB tables and recreate the whole tablespace. If the mariadbd server crashes after the startup or when you dump the tables. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      2024-02-01 13:04:49 1120 [ERROR] mariadbd: Index for table 'tts' is corrupt; try to repair it
      2024-02-01 13:04:49 1120 [ERROR] Slave SQL: Could not execute Update_rows_v1 event on table test.tts; Index for table 'tts' is corrupt; try to repair it, Error_code: 1034; handler error HA_ERR_CRASHED; the event's master log ao756-bin.000002, end_log_pos 402081942, Gtid 0-1-45135, Internal MariaDB error code: 1034
      2024-02-01 13:04:49 1120 [Warning] Slave: Index for table 'tts' is corrupt; try to repair it Error_code: 1034
      2024-02-01 13:04:49 1120 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'ao756-bin.000002' position 401879515
      2024-02-01 13:04:49 1118 [Note] Error reading relay log event: slave SQL thread was killed
      2024-02-01 13:04:49 1119 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
      2024-02-01 13:04:49 1119 [Warning] Slave: Connection was killed Error_code: 1927
      2024-02-01 13:04:49 1119 [Warning] Slave: Connection was killed Error_code: 1927
      2024-02-01 13:04:49 1119 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
      2024-02-01 13:04:49 1119 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
      2024-02-01 13:04:49 1119 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'ao756-bin.000002' position 401879515
      2024-02-01 13:04:49 1121 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
      2024-02-01 13:04:49 1121 [Warning] Slave: Connection was killed Error_code: 1927
      2024-02-01 13:04:49 1121 [Warning] Slave: Connection was killed Error_code: 1927
      2024-02-01 13:04:49 1121 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
      2024-02-01 13:04:49 1121 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
      2024-02-01 13:04:49 1121 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'ao756-bin.000002' position 401879515
      2024-02-01 13:04:49 1122 [ERROR] InnoDB: We detected index corruption in an InnoDB type table. You have to dump + drop + reimport the table or, in a case of widespread corruption, dump all InnoDB tables and recreate the whole tablespace. If the mariadbd server crashes after the startup or when you dump the tables. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      2024-02-01 13:04:49 1122 [ERROR] mariadbd: Index for table 'tts' is corrupt; try to repair it
      2024-02-01 13:04:49 1118 [Note] Slave SQL thread exiting, replication stopped in log 'ao756-bin.000002' at position 401879515, master: 192.168.1.19:3306
      

      The table is OK on primary, but CHECK TABLE on replica reports corruption(s):

      openxs@ao756:~/dbs/maria10.6$ bin/mysql --socket=/tmp/node1.sock test -e 'check table tts';
      +----------+-------+----------+----------+
      | Table    | Op    | Msg_type | Msg_text |
      +----------+-------+----------+----------+
      | test.tts | check | status   | OK       |
      +----------+-------+----------+----------+
      openxs@ao756:~/dbs/maria10.6$ bin/mysql --socket=/tmp/node2.sock test -e 'check table tts';
      +----------+-------+----------+-----------------------------------------------------+
      | Table    | Op    | Msg_type | Msg_text                                            |
      +----------+-------+----------+-----------------------------------------------------+
      | test.tts | check | Warning  | InnoDB: Index 'c4' contains 3 entries, should be 1. |
      | test.tts | check | error    | Corrupt                                             |
      +----------+-------+----------+-----------------------------------------------------+

      in one of indexes that involve column(s) that we change (indirectly) by the script.

      Attachments

        Issue Links

          Activity

            People

              Elkin Andrei Elkin
              valerii Valerii Kravchuk
              Votes:
              2 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.