Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18183

Server startup fails while dropping garbage encrypted tablespace

Details

    Description

      10.4 30da40bb8

      mysqld: /data/src/10.4/storage/innobase/include/sync0policy.h:62: MutexDebug<Mutex>::Context::Context(latch_id_t) [with Mutex = TTASEventMutex<GenericPolicy>]: Assertion `id != LATCH_ID_NONE' failed.
      190109 14:21:52 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f4a57fb6ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x000055e6d4a192e9 in MutexDebug<TTASEventMutex<GenericPolicy> >::Context::Context (this=0x7ffce2e1e520, id=LATCH_ID_NONE) at /data/src/10.4/storage/innobase/include/sync0policy.h:62
      #9  0x000055e6d4a17b0a in MutexDebug<TTASEventMutex<GenericPolicy> >::enter (this=0x55e6d643a1d0 <fil_crypt_threads_mutex+16>, mutex=0x55e6d643a1c0 <fil_crypt_threads_mutex>, name=0x55e6d52a7ec0 "/data/src/10.4/storage/innobase/fil/fil0crypt.cc", line=2370) at /data/src/10.4/storage/innobase/include/sync0policy.ic:66
      #10 0x000055e6d4a168fa in GenericPolicy<TTASEventMutex<GenericPolicy> >::enter (this=0x55e6d643a1d0 <fil_crypt_threads_mutex+16>, mutex=..., filename=0x55e6d52a7ec0 "/data/src/10.4/storage/innobase/fil/fil0crypt.cc", line=2370) at /data/src/10.4/storage/innobase/include/sync0policy.h:345
      #11 0x000055e6d4a14cf7 in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x55e6d643a1c0 <fil_crypt_threads_mutex>, n_spins=30, n_delay=4, name=0x55e6d52a7ec0 "/data/src/10.4/storage/innobase/fil/fil0crypt.cc", line=2370) at /data/src/10.4/storage/innobase/include/ib0mutex.h:594
      #12 0x000055e6d4d28e32 in fil_space_crypt_close_tablespace (space=0x55e6d8a1c340) at /data/src/10.4/storage/innobase/fil/fil0crypt.cc:2370
      #13 0x000055e6d4d17df4 in fil_check_pending_operations (id=639, operation=FIL_OPERATION_DELETE, space=0x7ffce2e1e860, path=0x7ffce2e1e858) at /data/src/10.4/storage/innobase/fil/fil0fil.cc:2392
      #14 0x000055e6d4d18377 in fil_delete_tablespace (id=639, drop_ahi=false) at /data/src/10.4/storage/innobase/fil/fil0fil.cc:2551
      #15 0x000055e6d4b601b4 in row_drop_table_for_mysql (name=0x55e6d8a1e9a8 "test/#sql-1910_e#P#ver_p1", trx=0x7f4a549b7118, sqlcom=SQLCOM_DROP_TABLE, create_failed=false, nonatomic=true) at /data/src/10.4/storage/innobase/row/row0mysql.cc:3768
      #16 0x000055e6d4b5dc25 in row_mysql_drop_garbage_tables () at /data/src/10.4/storage/innobase/row/row0mysql.cc:2771
      #17 0x000055e6d4aaf788 in recv_recovery_rollback_active () at /data/src/10.4/storage/innobase/log/log0recv.cc:3563
      #18 0x000055e6d4bc5c54 in srv_start (create_new_db=false) at /data/src/10.4/storage/innobase/srv/srv0start.cc:2145
      #19 0x000055e6d49edf5f in innodb_init (p=0x55e6d7e99550) at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:4252
      #20 0x000055e6d46c1f8d in ha_initialize_handlerton (plugin=0x55e6d7dcb608) at /data/src/10.4/sql/handler.cc:523
      #21 0x000055e6d43c8deb in plugin_initialize (tmp_root=0x7ffce2e26360, plugin=0x55e6d7dcb608, argc=0x55e6d5baaf30 <remaining_argc>, argv=0x55e6d7d753d8, options_only=false) at /data/src/10.4/sql/sql_plugin.cc:1437
      #22 0x000055e6d43c9a80 in plugin_init (argc=0x55e6d5baaf30 <remaining_argc>, argv=0x55e6d7d753d8, flags=0) at /data/src/10.4/sql/sql_plugin.cc:1720
      #23 0x000055e6d4287959 in init_server_components () at /data/src/10.4/sql/mysqld.cc:5333
      #24 0x000055e6d4288abe in mysqld_main (argc=19, argv=0x55e6d7d753d8) at /data/src/10.4/sql/mysqld.cc:5860
      #25 0x000055e6d427cf20 in main (argc=19, argv=0x7ffce2e26b98) at /data/src/10.4/sql/main.cc:25
      

      The backup dir (stored after prepare but before any server startup was attempted) is attached. To reproduce the failure, restore it to the datadir and start the server with

      --innodb-encrypt-tables --innodb-encrypt-log --innodb-encryption-threads=4 --aria-encrypt-tables=1 --encrypt-tmp-disk-tables=1 --file-key-management --plugin-load-add=file_key_management --file-key-management-filename=`pwd`/mysql-test/std_data/keys.txt
      

      (modify file-key-management-filename value as needed).


      Notes on the origin of the report. Please be aware it's not easily reproducible the same way, the test is concurrent by nature, and apparently there is a race condition involved.

      The problem initially happened by running the concurrent test with MariaBackupFull scenario:

      github.com/MariaDB/randgen elenst-dev 5342873c97, github.com/elenst/mariadb-toolbox master 6ded4b71568

      perl ./runall-new.pl --basedir=/home/travis/server/10.4 --vardir=/home/travis/logs/vardir --duration=350 --threads=4 --seed=1547032787 --reporters=Backtrace,ErrorLog,Deadlock --skip-gendata --gendata-advanced --views --grammar=conf/mariadb/generic-dml.yy --redefine=conf/mariadb/bulk_insert.yy --mysqld=--log_output=FILE --mysqld=--max-statement-time=20 --mysqld=--lock-wait-timeout=10 --mysqld=--loose-innodb-lock-wait-timeout=5 --mysqld=--loose-debug_assert_on_not_freed_memory=0 --filter=/home/travis/mariadb-toolbox/travis/10.4-combo-filter.ff --mysqld=--innodb-encrypt-tables --mysqld=--innodb-encrypt-log --mysqld=--innodb-encryption-threads=4 --mysqld=--aria-encrypt-tables=1 --mysqld=--encrypt-tmp-disk-tables=1 --mysqld=--file-key-management --mysqld=--file-key-management-filename=/home/travis/mariadb-toolbox/data/keys.txt --mysqld=--plugin-load-add=file_key_management --engine=Aria --scenario=MariaBackupFull --redefine=conf/mariadb/alter_table.yy --redefine=conf/mariadb/modules/admin.yy --redefine=conf/mariadb/versioning.yy --redefine=conf/mariadb/sequences.yy
      

      The test performs the following steps:

      • start the server with

        /home/travis/server/bin/mysqld --no-defaults --basedir=/home/travis/server --lc-messages-dir=/home/travis/server/share --character-sets-dir=/home/travis/server/share/charsets --tmpdir=/home/travis/logs/vardir/tmp --core-file --datadir=/home/travis/logs/vardir/data --max-allowed-packet=128Mb --port=19300 --socket=/home/travis/logs/vardir/mysql.sock --pid-file=/home/travis/logs/vardir/mysql.pid --general-log --general-log-file=/home/travis/logs/vardir/mysql.log --log_output=FILE --max-statement-time=20 --lock-wait-timeout=10 --loose-innodb-lock-wait-timeout=5 --loose-debug_assert_on_not_freed_memory=0 --innodb-encrypt-tables --innodb-encrypt-log --innodb-encryption-threads=4 --aria-encrypt-tables=1 --encrypt-tmp-disk-tables=1 --file-key-management --file-key-management-filename=/home/travis/mariadb-toolbox/data/keys.txt --plugin-load-add=file_key_management
        

      • run concurrent DML/DDL flow on the server;
      • after ~100 seconds, take the full backup from the running server (succeeded, the log attached as mbackup_backup.log;
      • after another ~100 seconds, shut down the server gracefully (succeeded, the error log is attached as mysql.err_orig;
      • run mariabackup --prepare (succeded, the log attached as mbackup_prepare.log;
      • cleanup server datadir and restore the backup to it (succeeded, the log attached as mbackup_restore.log;
      • try to start the server with the same options as before.

      I have no information yet whether the problem happens on earlier versions.

      Attachments

        1. backup.tar.gz
          4.85 MB
        2. mbackup_backup.log
          57 kB
        3. mbackup_prepare.log
          12 kB
        4. mbackup_restore.log
          17 kB
        5. mdev-18183-10.2v1.patch
          3 kB
        6. mysql.err_orig
          191 kB

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            I'm not sure if it's enough information to analyze the problem. Please return the report to me if it's not.

            elenst Elena Stepanova added a comment - - edited I'm not sure if it's enough information to analyze the problem. Please return the report to me if it's not.

            I reassigned it to marko, for the crash happens on startup, and seems to be more recovery related than backup related.
            However I created MDEV-18184 , since copying temporary files into backup does not make a lot of sense.

            wlad Vladislav Vaintroub added a comment - I reassigned it to marko , for the crash happens on startup, and seems to be more recovery related than backup related. However I created MDEV-18184 , since copying temporary files into backup does not make a lot of sense.

            --source include/have_innodb.inc
            -- source include/have_file_key_management_plugin.inc
             
            let $MYSQLD_DATADIR= `select @@datadir`;
            let datadir= `select @@datadir`;
             
            # These are from include/shutdown_mysqld.inc and allow to call start_mysqld.inc
            --let $_server_id= `SELECT @@server_id`
            --let $_expect_file_name= $MYSQLTEST_VARDIR/tmp/mysqld.$_server_id.expect
             
            create table t1(f1 int not null, f2 int not null)engine=innodb encrypted=yes;
            insert into t1 values(1, 2), (2, 3), (4, 5), (5, 6), (7, 8);
            insert into t1 select * from t1;
            SET DEBUG_DBUG='+d,innodb_alter_commit_crash_before_commit';
             
            # Write file to make mysql-test-run.pl expect crash
            --exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
             
            --error 2013
            alter table t1 force, algorithm=inplace;
             
            --source include/start_mysqld.inc
            select * from t1;
            drop table t1;
            

            The issue repeats with above test case.

            thiru Thirunarayanan Balathandayuthapani added a comment - --source include/have_innodb.inc -- source include/have_file_key_management_plugin.inc   let $MYSQLD_DATADIR= `select @@datadir`; let datadir= `select @@datadir`;   # These are from include/shutdown_mysqld.inc and allow to call start_mysqld.inc --let $_server_id= `SELECT @@server_id` --let $_expect_file_name= $MYSQLTEST_VARDIR/tmp/mysqld.$_server_id.expect   create table t1(f1 int not null, f2 int not null)engine=innodb encrypted=yes; insert into t1 values(1, 2), (2, 3), (4, 5), (5, 6), (7, 8); insert into t1 select * from t1; SET DEBUG_DBUG='+d,innodb_alter_commit_crash_before_commit';   # Write file to make mysql-test-run.pl expect crash --exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect   --error 2013 alter table t1 force, algorithm=inplace;   --source include/start_mysqld.inc select * from t1; drop table t1; The issue repeats with above test case.

            The latest 10.4 build produces a different assertion failure on the provided data directory and similar backup tests:

            10.4 301bd62b253

            mysqld: /home/travis/src/storage/innobase/include/sync0types.h:451: void OSMutex::enter(): Assertion `!m_freed' failed.
            190110 21:35:48 [ERROR] mysqld got signal 6 ;
             
            #7  0x00007f4a7bdd7c82 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
            #8  0x000055ba2dedde5c in OSMutex::enter (this=0x55ba2f1d7340 <fil_crypt_threads_mutex+64>) at /home/travis/src/storage/innobase/include/sync0types.h:451
            #9  0x000055ba2dee8425 in MutexDebug<TTASEventMutex<GenericPolicy> >::get (this=0x55ba2f1d7310 <fil_crypt_threads_mutex+16>) at /home/travis/src/storage/innobase/include/sync0policy.h:66
            #10 0x000055ba2dee6ab4 in MutexDebug<TTASEventMutex<GenericPolicy> >::get_thread_id (this=0x55ba2f1d7310 <fil_crypt_threads_mutex+16>) at /home/travis/src/storage/innobase/include/sync0policy.h:173
            #11 0x000055ba2dee5609 in MutexDebug<TTASEventMutex<GenericPolicy> >::is_owned (this=0x55ba2f1d7310 <fil_crypt_threads_mutex+16>) at /home/travis/src/storage/innobase/include/sync0policy.h:160
            #12 0x000055ba2dee56f1 in MutexDebug<TTASEventMutex<GenericPolicy> >::enter (this=0x55ba2f1d7310 <fil_crypt_threads_mutex+16>, mutex=..., filename=0x55ba2e837858 "/home/travis/src/storage/innobase/fil/fil0crypt.cc", line=2370) at /home/travis/src/storage/innobase/include/sync0policy.h:121
            #13 0x000055ba2dee3a17 in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x55ba2f1d7300 <fil_crypt_threads_mutex>, n_spins=30, n_delay=4, name=0x55ba2e837858 "/home/travis/src/storage/innobase/fil/fil0crypt.cc", line=2370) at /home/travis/src/storage/innobase/include/ib0mutex.h:593
            #14 0x000055ba2e1fe367 in fil_space_crypt_close_tablespace (space=0x55ba313d0870) at /home/travis/src/storage/innobase/fil/fil0crypt.cc:2370
            #15 0x000055ba2e1ecd6c in fil_check_pending_operations (id=1031, operation=FIL_OPERATION_DELETE, space=0x7ffc7daedf60, path=0x7ffc7daedf58) at /home/travis/src/storage/innobase/fil/fil0fil.cc:2392
            #16 0x000055ba2e1ed2f5 in fil_delete_tablespace (id=1031, drop_ahi=false) at /home/travis/src/storage/innobase/fil/fil0fil.cc:2551
            #17 0x000055ba2e031af6 in row_drop_table_for_mysql (name=0x55ba313dca18 "test/#sql-6044_f", trx=0x7f4a78852138, sqlcom=SQLCOM_DROP_TABLE, create_failed=false, nonatomic=true) at /home/travis/src/storage/innobase/row/row0mysql.cc:3768
            #18 0x000055ba2e02f4e6 in row_mysql_drop_garbage_tables () at /home/travis/src/storage/innobase/row/row0mysql.cc:2771
            #19 0x000055ba2df8019c in recv_recovery_rollback_active () at /home/travis/src/storage/innobase/log/log0recv.cc:3563
            #20 0x000055ba2e097eeb in srv_start (create_new_db=false) at /home/travis/src/storage/innobase/srv/srv0start.cc:2145
            #21 0x000055ba2debbc86 in innodb_init (p=0x55ba300bd210) at /home/travis/src/storage/innobase/handler/ha_innodb.cc:4252
            #22 0x000055ba2dca0a32 in ha_initialize_handlerton (plugin=0x55ba2fff4e58) at /home/travis/src/sql/handler.cc:523
            #23 0x000055ba2d9a6b88 in plugin_initialize (tmp_root=0x7ffc7daf59e0, plugin=0x55ba2fff4e58, argc=0x55ba2f157250 <remaining_argc>, argv=0x55ba2ff9f198, options_only=false) at /home/travis/src/sql/sql_plugin.cc:1437
            #24 0x000055ba2d9a783c in plugin_init (argc=0x55ba2f157250 <remaining_argc>, argv=0x55ba2ff9f198, flags=0) at /home/travis/src/sql/sql_plugin.cc:1719
            #25 0x000055ba2d864ee3 in init_server_components () at /home/travis/src/sql/mysqld.cc:5332
            #26 0x000055ba2d866046 in mysqld_main (argc=29, argv=0x55ba2ff9f198) at /home/travis/src/sql/mysqld.cc:5861
            #27 0x000055ba2d85a440 in main (argc=29, argv=0x7ffc7daf6218) at /home/travis/src/sql/main.cc:25
            

            elenst Elena Stepanova added a comment - The latest 10.4 build produces a different assertion failure on the provided data directory and similar backup tests: 10.4 301bd62b253 mysqld: /home/travis/src/storage/innobase/include/sync0types.h:451: void OSMutex::enter(): Assertion `!m_freed' failed. 190110 21:35:48 [ERROR] mysqld got signal 6 ;   #7 0x00007f4a7bdd7c82 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6 #8 0x000055ba2dedde5c in OSMutex::enter (this=0x55ba2f1d7340 <fil_crypt_threads_mutex+64>) at /home/travis/src/storage/innobase/include/sync0types.h:451 #9 0x000055ba2dee8425 in MutexDebug<TTASEventMutex<GenericPolicy> >::get (this=0x55ba2f1d7310 <fil_crypt_threads_mutex+16>) at /home/travis/src/storage/innobase/include/sync0policy.h:66 #10 0x000055ba2dee6ab4 in MutexDebug<TTASEventMutex<GenericPolicy> >::get_thread_id (this=0x55ba2f1d7310 <fil_crypt_threads_mutex+16>) at /home/travis/src/storage/innobase/include/sync0policy.h:173 #11 0x000055ba2dee5609 in MutexDebug<TTASEventMutex<GenericPolicy> >::is_owned (this=0x55ba2f1d7310 <fil_crypt_threads_mutex+16>) at /home/travis/src/storage/innobase/include/sync0policy.h:160 #12 0x000055ba2dee56f1 in MutexDebug<TTASEventMutex<GenericPolicy> >::enter (this=0x55ba2f1d7310 <fil_crypt_threads_mutex+16>, mutex=..., filename=0x55ba2e837858 "/home/travis/src/storage/innobase/fil/fil0crypt.cc", line=2370) at /home/travis/src/storage/innobase/include/sync0policy.h:121 #13 0x000055ba2dee3a17 in PolicyMutex<TTASEventMutex<GenericPolicy> >::enter (this=0x55ba2f1d7300 <fil_crypt_threads_mutex>, n_spins=30, n_delay=4, name=0x55ba2e837858 "/home/travis/src/storage/innobase/fil/fil0crypt.cc", line=2370) at /home/travis/src/storage/innobase/include/ib0mutex.h:593 #14 0x000055ba2e1fe367 in fil_space_crypt_close_tablespace (space=0x55ba313d0870) at /home/travis/src/storage/innobase/fil/fil0crypt.cc:2370 #15 0x000055ba2e1ecd6c in fil_check_pending_operations (id=1031, operation=FIL_OPERATION_DELETE, space=0x7ffc7daedf60, path=0x7ffc7daedf58) at /home/travis/src/storage/innobase/fil/fil0fil.cc:2392 #16 0x000055ba2e1ed2f5 in fil_delete_tablespace (id=1031, drop_ahi=false) at /home/travis/src/storage/innobase/fil/fil0fil.cc:2551 #17 0x000055ba2e031af6 in row_drop_table_for_mysql (name=0x55ba313dca18 "test/#sql-6044_f", trx=0x7f4a78852138, sqlcom=SQLCOM_DROP_TABLE, create_failed=false, nonatomic=true) at /home/travis/src/storage/innobase/row/row0mysql.cc:3768 #18 0x000055ba2e02f4e6 in row_mysql_drop_garbage_tables () at /home/travis/src/storage/innobase/row/row0mysql.cc:2771 #19 0x000055ba2df8019c in recv_recovery_rollback_active () at /home/travis/src/storage/innobase/log/log0recv.cc:3563 #20 0x000055ba2e097eeb in srv_start (create_new_db=false) at /home/travis/src/storage/innobase/srv/srv0start.cc:2145 #21 0x000055ba2debbc86 in innodb_init (p=0x55ba300bd210) at /home/travis/src/storage/innobase/handler/ha_innodb.cc:4252 #22 0x000055ba2dca0a32 in ha_initialize_handlerton (plugin=0x55ba2fff4e58) at /home/travis/src/sql/handler.cc:523 #23 0x000055ba2d9a6b88 in plugin_initialize (tmp_root=0x7ffc7daf59e0, plugin=0x55ba2fff4e58, argc=0x55ba2f157250 <remaining_argc>, argv=0x55ba2ff9f198, options_only=false) at /home/travis/src/sql/sql_plugin.cc:1437 #24 0x000055ba2d9a783c in plugin_init (argc=0x55ba2f157250 <remaining_argc>, argv=0x55ba2ff9f198, flags=0) at /home/travis/src/sql/sql_plugin.cc:1719 #25 0x000055ba2d864ee3 in init_server_components () at /home/travis/src/sql/mysqld.cc:5332 #26 0x000055ba2d866046 in mysqld_main (argc=29, argv=0x55ba2ff9f198) at /home/travis/src/sql/mysqld.cc:5861 #27 0x000055ba2d85a440 in main (argc=29, argv=0x7ffc7daf6218) at /home/travis/src/sql/main.cc:25

            mdev-18183-10.2v1.patch

            The above patch is for 10.2 version. Basically adding the test case is necessary for 10.2. Adding the test case in 10.1 is no use because dropping
            the garbage table is not enabled. So I request marko to use the test case in the patch while merging the fix to 10.2

            thiru Thirunarayanan Balathandayuthapani added a comment - mdev-18183-10.2v1.patch The above patch is for 10.2 version. Basically adding the test case is necessary for 10.2. Adding the test case in 10.1 is no use because dropping the garbage table is not enabled. So I request marko to use the test case in the patch while merging the fix to 10.2

            A separate crash recovery issue MDEV-18279 was encountered while testing this.

            The test case was only added to 10.2. Starting with 10.2.19 as part of the MDEV-13564 fix, InnoDB would drop orphan tables at startup. We do not know if this problem is repeatable in 10.1 via other means. We fixed the problem there without adding a test case.

            marko Marko Mäkelä added a comment - A separate crash recovery issue MDEV-18279 was encountered while testing this. The test case was only added to 10.2. Starting with 10.2.19 as part of the MDEV-13564  fix, InnoDB would drop orphan tables at startup. We do not know if this problem is repeatable in 10.1 via other means. We fixed the problem there without adding a test case.

            People

              thiru Thirunarayanan Balathandayuthapani
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.