Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25334

FTWRL/Backup blocks DDL on temporary tables with binlog enabled, assertion fails in Diagnostics_area::set_error_status

    XMLWordPrintable

Details

    Description

      The two test cases below differ only in FTWRL vs BACKUP STAGE x 2 and cause the same effect, both on release and debug builds.

      FTWRL

      --source include/have_binlog_format_mixed_or_statement.inc
       
      CREATE TEMPORARY TABLE tmp (a INT);
       
      --connect (con1,localhost,root,,)
      FLUSH TABLES WITH READ LOCK;
       
      --connection default
      SET lock_wait_timeout= 1;
      ALTER TABLE tmp;
      

      BACKUP STAGE

      --source include/have_binlog_format_mixed_or_statement.inc
       
      CREATE TEMPORARY TABLE tmp (a INT);
       
      --connect (con1,localhost,root,,)
      BACKUP STAGE START;
      BACKUP STAGE BLOCK_COMMIT;
       
      --connection default
      SET lock_wait_timeout= 1;
      ALTER TABLE tmp;
      

      The backup lock taken by con1 blocks ALTER on a temporary table in the connection default and makes it eventually fail with a timeout. It affects debug and non-debug builds alike.

      10.4 880baedc

      connection default;
      SET lock_wait_timeout= 1;
      ALTER TABLE tmp;
      bug.t                                    [ fail ]
              Test ended at 2021-04-03 20:00:05
       
      CURRENT_TEST: bug.t
      mysqltest: At line 12: query 'ALTER TABLE tmp' failed: 1205: Lock wait timeout exceeded; try restarting transaction
      

      Debug builds further abort with an assertion failure in diagnostics area:

      10.4 880baedc debug

      mysqld: /data/src/10.4/sql/sql_error.cc:445: void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*): Assertion `! is_set() || m_can_overwrite_status' failed.
      210403 20:02:19 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f081775af36 in __GI___assert_fail (assertion=0x5564e69c1468 "! is_set() || m_can_overwrite_status", file=0x5564e69c1200 "/data/src/10.4/sql/sql_error.cc", line=445, function=0x5564e69c1490 "void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*)") at assert.c:101
      #8  0x00005564e5c71870 in Diagnostics_area::set_error_status (this=0x7f07f40067a8, sql_errno=1205, message=0x7f08116dc3a0 "Lock wait timeout exceeded; try restarting transaction", sqlstate=0x5564e69e6e6d "HY000", ucid=..., error_condition=0x0) at /data/src/10.4/sql/sql_error.cc:445
      #9  0x00005564e5c4b83f in THD::raise_condition (this=0x7f07f4000d90, sql_errno=1205, sqlstate=0x5564e69e6e6d "HY000", level=Sql_state_errno_level::WARN_LEVEL_ERROR, ucid=..., msg=0x7f08116dc3a0 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.cc:1103
      #10 0x00005564e5b8cb50 in THD::raise_condition (this=0x7f07f4000d90, sql_errno=1205, sqlstate=0x0, level=Sql_state_errno_level::WARN_LEVEL_ERROR, msg=0x7f08116dc3a0 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.h:4400
      #11 0x00005564e5b7f530 in my_message_sql (error=1205, str=0x7f08116dc3a0 "Lock wait timeout exceeded; try restarting transaction", MyFlags=0) at /data/src/10.4/sql/mysqld.cc:3346
      #12 0x00005564e68dd15f in my_error (nr=1205, MyFlags=0) at /data/src/10.4/mysys/my_error.c:125
      #13 0x00005564e5e5f3eb in MDL_context::acquire_lock (this=0x7f07f4000eb0, mdl_request=0x7f08116dc870, lock_wait_timeout=1) at /data/src/10.4/sql/mdl.cc:2365
      #14 0x00005564e6173c16 in MYSQL_BIN_LOG::write (this=0x5564e7479300 <mysql_bin_log>, event_info=0x7f08116dcab0, with_annotate=0x0) at /data/src/10.4/sql/log.cc:6393
      #15 0x00005564e5c5cd1e in THD::binlog_query (this=0x7f07f4000d90, qtype=THD::STMT_QUERY_TYPE, query_arg=0x7f07f4014888 "ALTER TABLE tmp", query_len=15, is_trans=false, direct=false, suppress_use=false, errcode=0) at /data/src/10.4/sql/sql_class.cc:7209
      #16 0x00005564e5da7114 in write_bin_log (thd=0x7f07f4000d90, clear_error=true, query=0x7f07f4014888 "ALTER TABLE tmp", query_length=15, is_trans=false) at /data/src/10.4/sql/sql_table.cc:1987
      #17 0x00005564e5dbe074 in mysql_alter_table (thd=0x7f07f4000d90, new_db=0x7f07f4005560, new_name=0x7f07f4005988, create_info=0x7f08116df330, table_list=0x7f07f4014948, alter_info=0x7f08116df270, order_num=0, order=0x0, ignore=false) at /data/src/10.4/sql/sql_table.cc:9815
      #18 0x00005564e5e59264 in Sql_cmd_alter_table::execute (this=0x7f07f4015010, thd=0x7f07f4000d90) at /data/src/10.4/sql/sql_alter.cc:520
      #19 0x00005564e5ccfd9f in mysql_execute_command (thd=0x7f07f4000d90) at /data/src/10.4/sql/sql_parse.cc:6186
      #20 0x00005564e5cd519d in mysql_parse (thd=0x7f07f4000d90, rawbuf=0x7f07f4014888 "ALTER TABLE tmp", length=15, parser_state=0x7f08116e0550, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7989
      #21 0x00005564e5cc13f9 in dispatch_command (command=COM_QUERY, thd=0x7f07f4000d90, packet=0x7f07f400ace1 "ALTER TABLE tmp", packet_length=15, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1855
      #22 0x00005564e5cbfc61 in do_command (thd=0x7f07f4000d90) at /data/src/10.4/sql/sql_parse.cc:1373
      #23 0x00005564e5e4f9c1 in do_handle_one_connection (connect=0x5564e92ebfd0) at /data/src/10.4/sql/sql_connect.cc:1412
      #24 0x00005564e5e4f70a in handle_one_connection (arg=0x5564e92ebfd0) at /data/src/10.4/sql/sql_connect.cc:1316
      #25 0x00005564e68784e8 in pfs_spawn_thread (arg=0x5564e92381d0) at /data/src/10.4/storage/perfschema/pfs.cc:1869
      #26 0x00007f0817fdb609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #27 0x00007f0817846293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
       
      Query (0x7f07f4014888): ALTER TABLE tmp
       
      Connection ID (thread ID): 5
      Status: NOT_KILLED
      

      The problem was introduced in 10.4.16/10.5.7 by this commit:

      commit 16ea692ed490a1a2c6dc0b855247d5bd02bd2345 (HEAD)
      Author: Monty
      Date:   Mon Sep 7 10:38:12 2020 +0300
       
          MDEV-23586 Mariabackup: GTID saved for replication in 10.4.14 is wrong
          
          MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel
          replication
          
          Fixed by partly reverting MDEV-21953 to put back MDL_BACKUP_COMMIT locking
          before log_and_order.
          
          The original problem for MDEV-21953 was that while a thread was waiting in
          for another threads to commit in 'log_and_order', it had the
          MDL_BACKUP_COMMIT lock. The backup thread was waiting to get the
          MDL_BACKUP_WAIT_COMMIT lock, which blocks all new MDL_BACKUP_COMMIT locks.
          This causes a deadlock as the waited-for thread can never get past the
          MDL_BACKUP_COMMIT lock in ha_commit_trans.
          
          The main part of the bug fix is to release the MDL_BACKUP_COMMIT lock while
          a thread is waiting for other 'previous' threads to commit. This ensures
          that no transactional thread keeps MDL_BACKUP_COMMIT while waiting, which
          ensures that there are no deadlocks anymore.
      

      I can't figure from the commit comment whether the effect with blocking temporary table actions was intentional. The assertion failure obviously wasn't.

      Attachments

        Issue Links

          Activity

            People

              monty Michael Widenius
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.