Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-25334

FTWRL/Backup blocks DDL on temporary tables with binlog enabled, assertion fails in Diagnostics_area::set_error_status

    XMLWordPrintable

    Details

      Description

      The two test cases below differ only in FTWRL vs BACKUP STAGE x 2 and cause the same effect, both on release and debug builds.

      FTWRL

      --source include/have_binlog_format_mixed_or_statement.inc
       
      CREATE TEMPORARY TABLE tmp (a INT);
       
      --connect (con1,localhost,root,,)
      FLUSH TABLES WITH READ LOCK;
       
      --connection default
      SET lock_wait_timeout= 1;
      ALTER TABLE tmp;
      

      BACKUP STAGE

      --source include/have_binlog_format_mixed_or_statement.inc
       
      CREATE TEMPORARY TABLE tmp (a INT);
       
      --connect (con1,localhost,root,,)
      BACKUP STAGE START;
      BACKUP STAGE BLOCK_COMMIT;
       
      --connection default
      SET lock_wait_timeout= 1;
      ALTER TABLE tmp;
      

      The backup lock taken by con1 blocks ALTER on a temporary table in the connection default and makes it eventually fail with a timeout. It affects debug and non-debug builds alike.

      10.4 880baedc

      connection default;
      SET lock_wait_timeout= 1;
      ALTER TABLE tmp;
      bug.t                                    [ fail ]
              Test ended at 2021-04-03 20:00:05
       
      CURRENT_TEST: bug.t
      mysqltest: At line 12: query 'ALTER TABLE tmp' failed: 1205: Lock wait timeout exceeded; try restarting transaction
      

      Debug builds further abort with an assertion failure in diagnostics area:

      10.4 880baedc debug

      mysqld: /data/src/10.4/sql/sql_error.cc:445: void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*): Assertion `! is_set() || m_can_overwrite_status' failed.
      210403 20:02:19 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f081775af36 in __GI___assert_fail (assertion=0x5564e69c1468 "! is_set() || m_can_overwrite_status", file=0x5564e69c1200 "/data/src/10.4/sql/sql_error.cc", line=445, function=0x5564e69c1490 "void Diagnostics_area::set_error_status(uint, const char*, const char*, const Sql_user_condition_identity&, const Sql_condition*)") at assert.c:101
      #8  0x00005564e5c71870 in Diagnostics_area::set_error_status (this=0x7f07f40067a8, sql_errno=1205, message=0x7f08116dc3a0 "Lock wait timeout exceeded; try restarting transaction", sqlstate=0x5564e69e6e6d "HY000", ucid=..., error_condition=0x0) at /data/src/10.4/sql/sql_error.cc:445
      #9  0x00005564e5c4b83f in THD::raise_condition (this=0x7f07f4000d90, sql_errno=1205, sqlstate=0x5564e69e6e6d "HY000", level=Sql_state_errno_level::WARN_LEVEL_ERROR, ucid=..., msg=0x7f08116dc3a0 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.cc:1103
      #10 0x00005564e5b8cb50 in THD::raise_condition (this=0x7f07f4000d90, sql_errno=1205, sqlstate=0x0, level=Sql_state_errno_level::WARN_LEVEL_ERROR, msg=0x7f08116dc3a0 "Lock wait timeout exceeded; try restarting transaction") at /data/src/10.4/sql/sql_class.h:4400
      #11 0x00005564e5b7f530 in my_message_sql (error=1205, str=0x7f08116dc3a0 "Lock wait timeout exceeded; try restarting transaction", MyFlags=0) at /data/src/10.4/sql/mysqld.cc:3346
      #12 0x00005564e68dd15f in my_error (nr=1205, MyFlags=0) at /data/src/10.4/mysys/my_error.c:125
      #13 0x00005564e5e5f3eb in MDL_context::acquire_lock (this=0x7f07f4000eb0, mdl_request=0x7f08116dc870, lock_wait_timeout=1) at /data/src/10.4/sql/mdl.cc:2365
      #14 0x00005564e6173c16 in MYSQL_BIN_LOG::write (this=0x5564e7479300 <mysql_bin_log>, event_info=0x7f08116dcab0, with_annotate=0x0) at /data/src/10.4/sql/log.cc:6393
      #15 0x00005564e5c5cd1e in THD::binlog_query (this=0x7f07f4000d90, qtype=THD::STMT_QUERY_TYPE, query_arg=0x7f07f4014888 "ALTER TABLE tmp", query_len=15, is_trans=false, direct=false, suppress_use=false, errcode=0) at /data/src/10.4/sql/sql_class.cc:7209
      #16 0x00005564e5da7114 in write_bin_log (thd=0x7f07f4000d90, clear_error=true, query=0x7f07f4014888 "ALTER TABLE tmp", query_length=15, is_trans=false) at /data/src/10.4/sql/sql_table.cc:1987
      #17 0x00005564e5dbe074 in mysql_alter_table (thd=0x7f07f4000d90, new_db=0x7f07f4005560, new_name=0x7f07f4005988, create_info=0x7f08116df330, table_list=0x7f07f4014948, alter_info=0x7f08116df270, order_num=0, order=0x0, ignore=false) at /data/src/10.4/sql/sql_table.cc:9815
      #18 0x00005564e5e59264 in Sql_cmd_alter_table::execute (this=0x7f07f4015010, thd=0x7f07f4000d90) at /data/src/10.4/sql/sql_alter.cc:520
      #19 0x00005564e5ccfd9f in mysql_execute_command (thd=0x7f07f4000d90) at /data/src/10.4/sql/sql_parse.cc:6186
      #20 0x00005564e5cd519d in mysql_parse (thd=0x7f07f4000d90, rawbuf=0x7f07f4014888 "ALTER TABLE tmp", length=15, parser_state=0x7f08116e0550, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7989
      #21 0x00005564e5cc13f9 in dispatch_command (command=COM_QUERY, thd=0x7f07f4000d90, packet=0x7f07f400ace1 "ALTER TABLE tmp", packet_length=15, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1855
      #22 0x00005564e5cbfc61 in do_command (thd=0x7f07f4000d90) at /data/src/10.4/sql/sql_parse.cc:1373
      #23 0x00005564e5e4f9c1 in do_handle_one_connection (connect=0x5564e92ebfd0) at /data/src/10.4/sql/sql_connect.cc:1412
      #24 0x00005564e5e4f70a in handle_one_connection (arg=0x5564e92ebfd0) at /data/src/10.4/sql/sql_connect.cc:1316
      #25 0x00005564e68784e8 in pfs_spawn_thread (arg=0x5564e92381d0) at /data/src/10.4/storage/perfschema/pfs.cc:1869
      #26 0x00007f0817fdb609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #27 0x00007f0817846293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
       
      Query (0x7f07f4014888): ALTER TABLE tmp
       
      Connection ID (thread ID): 5
      Status: NOT_KILLED
      

      The problem was introduced in 10.4.16/10.5.7 by this commit:

      commit 16ea692ed490a1a2c6dc0b855247d5bd02bd2345 (HEAD)
      Author: Monty
      Date:   Mon Sep 7 10:38:12 2020 +0300
       
          MDEV-23586 Mariabackup: GTID saved for replication in 10.4.14 is wrong
          
          MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel
          replication
          
          Fixed by partly reverting MDEV-21953 to put back MDL_BACKUP_COMMIT locking
          before log_and_order.
          
          The original problem for MDEV-21953 was that while a thread was waiting in
          for another threads to commit in 'log_and_order', it had the
          MDL_BACKUP_COMMIT lock. The backup thread was waiting to get the
          MDL_BACKUP_WAIT_COMMIT lock, which blocks all new MDL_BACKUP_COMMIT locks.
          This causes a deadlock as the waited-for thread can never get past the
          MDL_BACKUP_COMMIT lock in ha_commit_trans.
          
          The main part of the bug fix is to release the MDL_BACKUP_COMMIT lock while
          a thread is waiting for other 'previous' threads to commit. This ensures
          that no transactional thread keeps MDL_BACKUP_COMMIT while waiting, which
          ensures that there are no deadlocks anymore.
      

      I can't figure from the commit comment whether the effect with blocking temporary table actions was intentional. The assertion failure obviously wasn't.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              monty Michael Widenius
              Reporter:
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: