Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32873

Test innodb.innodb-index-online occasionally fails

Details

    Description

      The test innodb.innodb-index-online often fails like this on 10.6 or later branches. I suspect that it could be due to some changed timing characteristics related to MDEV-32050.

      11.0 5b6134b040b59d6c896ec0a2c86029a0bcca788c

      innodb.innodb-index-online '4k,clear,innodb' w19 [ fail ]
              Test ended at 2023-11-24 10:49:23
       
      CURRENT_TEST: innodb.innodb-index-online
      --- /mariadb/11/mysql-test/suite/innodb/r/innodb-index-online.result	2023-11-24 10:15:47.229132350 +0200
      +++ /mariadb/11/mysql-test/suite/innodb/r/innodb-index-online.reject	2023-11-24 10:49:23.174794627 +0200
      @@ -170,8 +170,8 @@
       ERROR 70100: Query execution was interrupted
       SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
       name	count
      -ddl_background_drop_indexes	0
      -ddl_online_create_index	0
      +ddl_background_drop_indexes	1
      +ddl_online_create_index	1
       ddl_pending_alter_table	0
       ddl_sort_file_alter_table	0
       ddl_log_file_alter_table	0
       
      mysqltest: Result content mismatch
      

      The test scenario is the KILL of the following statement:

      SET DEBUG_SYNC = 'row_log_apply_before SIGNAL c2d_created WAIT_FOR kill_done';
      CREATE INDEX c2d ON t1(c2);
      

      When this operation is rolled back, in row_merge_drop_indexes() we would normally have table->get_ref_count() == 1. In a failure case, a purge operation would intervene and open a table handle while row_merge_build_indexes() is blocked by the DEBUG_SYNC. This extra table handle would remain open across the rollback operation.

      The following patch (which depends on MDEV-32050, where we started to ignore innodb_purge_rseg_truncate_frequency) will make the test stable for me. The first hunk makes the test ./mtr --repeat proof, for easier testing.

      diff --git a/mysql-test/suite/innodb/t/innodb-index-online.test b/mysql-test/suite/innodb/t/innodb-index-online.test
      index 3a38afa84b7..059c9e01a0c 100644
      --- a/mysql-test/suite/innodb/t/innodb-index-online.test
      +++ b/mysql-test/suite/innodb/t/innodb-index-online.test
      @@ -4,6 +4,10 @@
       --source include/have_debug_sync.inc
       --source include/no_valgrind_without_big.inc
       
      +SET GLOBAL innodb_monitor_reset_all=all;
      +--disable_warnings
      +SET GLOBAL innodb_monitor_reset_all=default;
      +--enable_warnings
       let $innodb_metrics_select=
       SELECT name, count FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE subsystem = 'ddl';
       
      @@ -154,6 +158,7 @@ let $ID= `SELECT @id := CONNECTION_ID()`;
       --error ER_QUERY_INTERRUPTED
       KILL QUERY @id;
       
      +SET GLOBAL innodb_max_purge_lag_wait=0;
       SET DEBUG_SYNC = 'row_log_apply_before SIGNAL c2d_created WAIT_FOR kill_done';
       --send
       CREATE INDEX c2d ON t1(c2);
      

      Attachments

        Issue Links

          Activity

            With the above fix, the following invocation completed successfully:

            ./mtr --repeat=100 --parallel=48 innodb.innodb-index-online{,,,}{,}
            

            11.0 5b6134b040b59d6c896ec0a2c86029a0bcca788c

            innodb.innodb-index-online '16k,crypt,innodb' w8 [ 100 pass ]   5817
            --------------------------------------------------------------------------
            The servers were restarted 0 times
            Spent 27490.284 of 680 seconds executing testcases
             
            Completed: All 4800 tests were successful.
            

            I checked also the ‘sibling’ test that covers table-rebuilding online ALTER TABLE. It does not seem to require any action, because it last failed on 2021-10-06, and mostly on Microsoft Windows. Here is the latest non-Windows failure (one day before the last failure on Windows):

            10.6 4ca56e8348c7bd20cb7544a607b730d6

            innodb.innodb-table-online '4k,clear,innodb' w4 [ fail ]
                    Test ended at 2021-10-05 08:57:46
             
            CURRENT_TEST: innodb.innodb-table-online
            --- /usr/share/mysql-test/suite/innodb/r/innodb-table-online.result	2021-10-05 07:22:49.000000000 +0000
            +++ /dev/shm/var/4/log/innodb-table-online.reject	2021-10-05 08:57:46.150463032 +0000
            @@ -212,7 +212,7 @@
             ddl_online_create_index	1
             ddl_pending_alter_table	1
             ddl_sort_file_alter_table	0
            -ddl_log_file_alter_table	1
            +ddl_log_file_alter_table	0
             BEGIN;
             DELETE FROM t1;
             ROLLBACK;
             
            mysqltest: Result content mismatch
            

            marko Marko Mäkelä added a comment - With the above fix, the following invocation completed successfully: . /mtr --repeat=100 --parallel=48 innodb.innodb-index-online{,,,}{,} 11.0 5b6134b040b59d6c896ec0a2c86029a0bcca788c innodb.innodb-index-online '16k,crypt,innodb' w8 [ 100 pass ] 5817 -------------------------------------------------------------------------- The servers were restarted 0 times Spent 27490.284 of 680 seconds executing testcases   Completed: All 4800 tests were successful. I checked also the ‘sibling’ test that covers table-rebuilding online ALTER TABLE . It does not seem to require any action, because it last failed on 2021-10-06, and mostly on Microsoft Windows. Here is the latest non-Windows failure (one day before the last failure on Windows): 10.6 4ca56e8348c7bd20cb7544a607b730d6 innodb.innodb-table-online '4k,clear,innodb' w4 [ fail ] Test ended at 2021-10-05 08:57:46   CURRENT_TEST: innodb.innodb-table-online --- /usr/share/mysql-test/suite/innodb/r/innodb-table-online.result 2021-10-05 07:22:49.000000000 +0000 +++ /dev/shm/var/4/log/innodb-table-online.reject 2021-10-05 08:57:46.150463032 +0000 @@ -212,7 +212,7 @@ ddl_online_create_index 1 ddl_pending_alter_table 1 ddl_sort_file_alter_table 0 -ddl_log_file_alter_table 1 +ddl_log_file_alter_table 0 BEGIN; DELETE FROM t1; ROLLBACK;   mysqltest: Result content mismatch

            After making the test innodb.innodb-table-online work with ./mtr --repeat, I was able to reproduce a result difference for it as well:

            11.0 5b6134b040b59d6c896ec0a2c86029a0bcca788c

            innodb.innodb-table-online '16k,crypt,innodb' w6 [ 2 fail ]
                    Test ended at 2023-11-24 12:54:43
             
            CURRENT_TEST: innodb.innodb-table-online
            --- /mariadb/11/mysql-test/suite/innodb/r/innodb-table-online,crypt.result~	2023-11-24 12:54:34.235128293 +0200
            +++ /mariadb/11/mysql-test/suite/innodb/r/innodb-table-online,crypt.reject	2023-11-24 12:54:43.067237664 +0200
            @@ -251,7 +251,7 @@
             @merge_encrypt_1>@merge_encrypt_0, @merge_decrypt_1>@merge_decrypt_0,
             @rowlog_encrypt_1>@rowlog_encrypt_0;
             sort_balance	@merge_encrypt_1>@merge_encrypt_0	@merge_decrypt_1>@merge_decrypt_0	@rowlog_encrypt_1>@rowlog_encrypt_0
            -0	0	0	1
            +0	0	0	0
             SET DEBUG_SYNC = 'now SIGNAL dml2_done';
             connection con1;
             ERROR HY000: Creating index 'PRIMARY' required more than 'innodb_online_alter_log_max_size' bytes of modification log. Please try again
             
            mysqltest: Result content mismatch
            

            This has something to do with the encryption of the log of concurrent DML operations. I will try for some time if I can reproduce this with rr, but it does not look promising.

            marko Marko Mäkelä added a comment - After making the test innodb.innodb-table-online work with ./mtr --repeat , I was able to reproduce a result difference for it as well: 11.0 5b6134b040b59d6c896ec0a2c86029a0bcca788c innodb.innodb-table-online '16k,crypt,innodb' w6 [ 2 fail ] Test ended at 2023-11-24 12:54:43   CURRENT_TEST: innodb.innodb-table-online --- /mariadb/11/mysql-test/suite/innodb/r/innodb-table-online,crypt.result~ 2023-11-24 12:54:34.235128293 +0200 +++ /mariadb/11/mysql-test/suite/innodb/r/innodb-table-online,crypt.reject 2023-11-24 12:54:43.067237664 +0200 @@ -251,7 +251,7 @@ @merge_encrypt_1>@merge_encrypt_0, @merge_decrypt_1>@merge_decrypt_0, @rowlog_encrypt_1>@rowlog_encrypt_0; sort_balance @merge_encrypt_1>@merge_encrypt_0 @merge_decrypt_1>@merge_decrypt_0 @rowlog_encrypt_1>@rowlog_encrypt_0 -0 0 0 1 +0 0 0 0 SET DEBUG_SYNC = 'now SIGNAL dml2_done'; connection con1; ERROR HY000: Creating index 'PRIMARY' required more than 'innodb_online_alter_log_max_size' bytes of modification log. Please try again   mysqltest: Result content mismatch This has something to do with the encryption of the log of concurrent DML operations. I will try for some time if I can reproduce this with rr , but it does not look promising.

            I filed MDEV-32874 for the innodb.innodb-table-online,crypt result difference.

            marko Marko Mäkelä added a comment - I filed MDEV-32874 for the innodb.innodb-table-online,crypt result difference.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.