Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32372

Assertion `thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)' failed in close_thread_tables upon ALTER

Details

    Description

      The following SQL, when executed in a m/s setup and using CLI on the master:

      # mysqld options required for replay: --slave-parallel-threads=65
      ALTER TABLE mysql.gtid_slave_pos DROP PRIMARY KEY;
      CREATE TABLE t (c1 INT) ENGINE=InnoDB;
      INSERT INTO t VALUES (1);
      ALTER TABLE t ADD COLUMN c INT;
      

      Will, sporadically, lead to a crash on the slave, but only when --slave-parallel-threads is set.

      11.3.0 905c3d61e18ae6222d0d195c43d335046eec65d9 (Debug)

      Core was generated by `/test/MD270923-mariadb-11.3.0-linux-x86_64-dbg/bin/mariadbd --no-defaults --sla'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=22820648896064)
          at ./nptl/pthread_kill.c:44
      [Current thread is 1 (Thread 0x14c158abc640 (LWP 1360026))]
      (gdb) bt
      #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=22820648896064) at ./nptl/pthread_kill.c:44
      #1  __pthread_kill_internal (signo=6, threadid=22820648896064) at ./nptl/pthread_kill.c:78
      #2  __GI___pthread_kill (threadid=22820648896064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
      #3  0x000014c181c42476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
      #4  0x000014c181c287f3 in __GI_abort () at ./stdlib/abort.c:79
      #5  0x000014c181c2871b in __assert_fail_base (fmt=0x14c181ddd150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5616d920d850 "thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)", file=0x5616d920d1d0 "/test/11.3_dbg/sql/sql_base.cc", line=833, function=<optimized out>) at ./assert/assert.c:92
      #6  0x000014c181c39e96 in __GI___assert_fail (assertion=0x5616d920d850 "thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)", file=0x5616d920d1d0 "/test/11.3_dbg/sql/sql_base.cc", line=833, function=0x5616d920f9c7 "int close_thread_tables(THD*)") at ./assert/assert.c:101
      #7  0x00005616d8599495 in close_thread_tables (thd=thd@entry=0x14bfc0000f98) at /test/11.3_dbg/sql/sql_base.cc:833
      #8  0x00005616d85994e1 in close_thread_tables_for_query (thd=thd@entry=0x14bfc0000f98) at /test/11.3_dbg/sql/sql_base.cc:792
      #9  0x00005616d8624b23 in mysql_execute_command (thd=thd@entry=0x14bfc0000f98, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/11.3_dbg/sql/sql_parse.cc:5824
      #10 0x00005616d86255f7 in mysql_parse (thd=0x14bfc0000f98, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x14c158abb690) at /test/11.3_dbg/sql/sql_parse.cc:7744
      #11 0x00005616d8a9ebc3 in Query_log_event::do_apply_event (this=0x14c0d006fd38, rgi=0x14c0d006f2e0, query_arg=0x14c0d006f253 "ALTER TABLE t ADD COLUMN c INT", q_len_arg=<optimized out>) at /test/11.3_dbg/sql/sql_class.h:247
      #12 0x00005616d8a9f45c in Query_log_event::do_apply_event (this=<optimized out>, rgi=<optimized out>) at /test/11.3_dbg/sql/log_event_server.cc:1577
      #13 0x00005616d8a8f9f5 in Log_event::apply_event (this=this@entry=0x14c0d006fd38, rgi=rgi@entry=0x14c0d006f2e0) at /test/11.3_dbg/sql/log_event.cc:3875
      #14 0x00005616d8547c3d in apply_event_and_update_pos_apply (ev=ev@entry=0x14c0d006fd38, thd=thd@entry=0x14bfc0000f98, rgi=rgi@entry=0x14c0d006f2e0, reason=reason@entry=0) at /test/11.3_dbg/sql/slave.cc:3888
      #15 0x00005616d8551da7 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x14c0d006fd38, thd=thd@entry=0x14bfc0000f98, rgi=rgi@entry=0x14c0d006f2e0) at /test/11.3_dbg/sql/slave.cc:4085
      #16 0x00005616d8822536 in rpt_handle_event (qev=qev@entry=0x14c0d006fe98, rpt=rpt@entry=0x14c0d0026ce0) at /test/11.3_dbg/sql/rpl_parallel.cc:64
      #17 0x00005616d88270da in handle_rpl_parallel_thread (arg=arg@entry=0x14c0d0026ce0) at /test/11.3_dbg/sql/rpl_parallel.cc:1506
      #18 0x00005616d8bcc20a in pfs_spawn_thread (arg=0x14c0d00389d8) at /test/11.3_dbg/storage/perfschema/pfs.cc:2201
      #19 0x000014c181c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
      #20 0x000014c181d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
      

      It is observed somewhat regularly in replicating testing. The following testcase variation seems to be reproduce the issue a bit less sporadically on 10.4 and 10.5, but it is hard to say due to the sporadic nature of the bug.

      # mysqld options required for replay: --slave-parallel-threads=65
      ALTER TABLE mysql.gtid_slave_pos DROP PRIMARY KEY;
      CREATE TABLE t(c1 INT) DEFAULT CHARSET=ujis;
      INSERT INTO t VALUES (0);
      ALTER TABLE t ADD COLUMN c INT;
      

      10.4.32 47f0135d7aba57d160d1d95ff7f1f6b7bb07c2fb (Debug)

      Core was generated by `/test/MD270923-mariadb-10.4.32-linux-x86_64-dbg/bin/mariadbd --no-defaults --gt'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=23430516848192)
          at ./nptl/pthread_kill.c:44
      [Current thread is 1 (Thread 0x154f57a23640 (LWP 2459976))]
      (gdb) bt
      #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=23430516848192) at ./nptl/pthread_kill.c:44
      #1  __pthread_kill_internal (signo=6, threadid=23430516848192) at ./nptl/pthread_kill.c:78
      #2  __GI___pthread_kill (threadid=23430516848192, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
      #3  0x0000154fcd642476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
      #4  0x0000154fcd6287f3 in __GI_abort () at ./stdlib/abort.c:79
      #5  0x0000154fcd62871b in __assert_fail_base (fmt=0x154fcd7dd150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55f19c2d78e0 "thd->transaction.stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)", file=0x55f19c2d7230 "/test/10.4_dbg/sql/sql_base.cc", line=897, function=<optimized out>) at ./assert/assert.c:92
      #6  0x0000154fcd639e96 in __GI___assert_fail (assertion=0x55f19c2d78e0 "thd->transaction.stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)", file=0x55f19c2d7230 "/test/10.4_dbg/sql/sql_base.cc", line=897, function=0x55f19c2d78c0 "void close_thread_tables(THD*)") at ./assert/assert.c:101
      #7  0x000055f19b6a3261 in close_thread_tables (thd=thd@entry=0x154e58000d38) at /test/10.4_dbg/sql/sql_base.cc:897
      #8  0x000055f19b723949 in mysql_execute_command (thd=thd@entry=0x154e58000d38) at /test/10.4_dbg/sql/sql_parse.cc:6283
      #9  0x000055f19b7252ce in mysql_parse (thd=0x154e58000d38, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x154f57a22830, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.4_dbg/sql/sql_parse.cc:8012
      #10 0x000055f19baffc44 in Query_log_event::do_apply_event (this=0x154f64030078, rgi=0x154f6402f550, query_arg=0x154f640301b3 "ALTER TABLE t ADD COLUMN c INT", q_len_arg=<optimized out>) at /test/10.4_dbg/sql/sql_class.h:227
      #11 0x000055f19bb0060a in Query_log_event::do_apply_event (this=<optimized out>, rgi=<optimized out>) at /test/10.4_dbg/sql/log_event.cc:5422
      #12 0x000055f19b649fa6 in Log_event::apply_event (rgi=0x154f6402f550, this=0x154f64030078) at /test/10.4_dbg/sql/log_event.h:1492
      #13 apply_event_and_update_pos_apply (ev=ev@entry=0x154f64030078, thd=thd@entry=0x154e58000d38, rgi=rgi@entry=0x154f6402f550, reason=reason@entry=0) at /test/10.4_dbg/sql/slave.cc:3820
      #14 0x000055f19b6549f9 in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x154f64030078, thd=thd@entry=0x154e58000d38, rgi=rgi@entry=0x154f6402f550) at /test/10.4_dbg/sql/slave.cc:4006
      #15 0x000055f19b8afb09 in rpt_handle_event (qev=qev@entry=0x154f64030248, rpt=rpt@entry=0x154f64012bb8) at /test/10.4_dbg/sql/rpl_parallel.cc:64
      #16 0x000055f19b8b5a26 in handle_rpl_parallel_thread (arg=0x154f64012bb8) at /test/10.4_dbg/sql/rpl_parallel.cc:1480
      #17 0x0000154fcd694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
      #18 0x0000154fcd726a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
      

      In spite of trying various ways, the issue is not reproducible in MTR. Somehow the timing seems different in MTR. The following testcase for example:

      --source include/have_binlog_format_row.inc
      --source include/master-slave.inc
      --source include/have_innodb.inc
      --connection slave
      STOP SLAVE;
      SET GLOBAL slave_parallel_threads=65;
      START SLAVE;
      SELECT SLEEP(2);
      --connection master
      ALTER TABLE mysql.gtid_slave_pos DROP PRIMARY KEY;
      CREATE TABLE t(c1 INT)DEFAULT CHARSET=ujis;
      INSERT INTO t VALUES(0);
      ALTER TABLE t ADD COLUMN c INT;
      --source include/rpl_end.inc
      

      Simply results in the expected (these type of errors also show prior to the assertion reproducing when using the CLI):

      11.3.0 905c3d61e18ae6222d0d195c43d335046eec65d9

      2023-10-07 16:16:23 77 [ERROR] Incorrect definition of table mysql.gtid_slave_pos: missing primary key.
      

      Without the crash. Passing the parallel threads option directly to mtr makes no difference, and MTR does not seem to have a way to specify slave-only mariadbd options. UniqueID's/stacks observed thus far:

      SIGABRT|close_thread_tables|close_thread_tables_for_query|mysql_execute_command|mysql_parse  # 11.3
      SIGABRT|close_thread_tables|mysql_execute_command|mysql_parse|Query_log_event::do_apply_event  # 10.4
      thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)|SIGABRT|close_thread_tables|close_thread_tables_for_query|mysql_execute_command|mysql_parse  # 11.5
      thd->transaction.stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)|SIGABRT|close_thread_tables|mysql_execute_command|mysql_parse|Query_log_event::do_apply_event  # 10.4
      thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)|SIGABRT|close_thread_tables|mysql_execute_command|mysql_parse|Query_log_event::do_apply_event  # 10.6
      

      Attachments

        Issue Links

          Activity

            Roel Roel Van de Paar added a comment - - edited

            Also seen during MDEV-33426 testing as:

            thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)|SIGABRT|__vfwscanf_internal|Backtrace stopped: Cannot access memory at address|
            

            GDB:

            11.4.0 faf48c262fd3f350b91482865b1bc1d8a037502b (Debug)

            Core was generated by `/test/MDEV33426_MD160224-mariadb-11.4.0-linux-x86_64-dbg/bin/mariadbd --no-defa'.
            Program terminated with signal SIGABRT, Aborted.
            #0  0x0000145f302969fc in __vfwscanf_internal (s=<optimized out>, format=<optimized out>, argptr=<optimized out>, mode_flags=<optimized out>)at ./stdio-common/vfscanf-internal.c:2472
            [Current thread is 1 (LWP 1502930)]
            (gdb) bt
            #0  0x0000145f302969fc in __vfwscanf_internal (s=<optimized out>, format=<optimized out>, argptr=<optimized out>, mode_flags=<optimized out>)at ./stdio-common/vfscanf-internal.c:2472
            Backtrace stopped: Cannot access memory at address 0x16eeda
            

            So memory corruption in connection with this bug seems likely. In this case the partially reduced testcase was (for reference, not replay):

            ALTER TABLE mysql.gtid_slave_pos ENGINE=Aria;
            CREATE TABLE t1 (c1 TINYINT NULL, c2 CHAR(5)) PARTITION BY LINEAR KEY(c1) PARTITIONS 99;
            INSERT INTO t1 VALUES (10,2), (20,3), (30,4), (40,5), (50,10);
            UPDATE t1 SET c1='v"*s:k(t_WUHbtu0/F3x/"LgJ5U6*LC3(jbsl&c^hm5?1svxw#GTUm/jg(aXz$zw?mv@=%"^C<]B#oZ,{G,,5VJeQEmh4Cy,E%KPtew{fydG9d"J?**`m&S#[ig]]U]3~`0AiFzMXkJ`Q5ng}umM*At;(up_]%0Ofk!/)e{r#A@8/;ReSUZfU..mrX;60txJ3f@boLC--kb]l6HRK,`f(h,B%Xz~I7dy86msuc/mq;tj%C~c4P.Z]S"%{d_,"=O);Y[)@nVW]vc=5ts1>+cVUx%FPUCkU$@$hUU1^q`!dV8+AoDSAe.]Ufk}zxgREw=)aH$UXmUtOB+rCYjB-4^kdFxSb*9<LWBy#uqvcuQuw%.vU=uAbFLpD$08(&jD(`)8B6G,E(<.S#k=x^-h7Z25)V+Y$>`GHhSc5@h*sjGd7&"WobF>Q9Rgso~`9*SQ~7?<y+r!JW.v/oMq+LW+4g]T>Ew58SgDW>Y$H`C4BTtO`-tC26Y3FvUCp3O.h1:bk3[R^i`%*1UgP=L<m1&_x:#w`ztl^`>)WeE3UR%IV2=5bsJPKG6}JcRsFW^3r!o,1EJS5:J1Yxv]tHE1/GqSY!Qu`FL`dzv)}(r"&LzVrsofkdEl`gCzRgt+:HmkkZJ!6*US-%D^X[4p5:qU]1j"?N;IfaJv6/<CQ5$tB8zpLY)KT665P85<Jl*l]sBhIAn"0?!h0+=6o2%HgPnlExb_.~SkPm$+.;yP41`^3a8_YMO9oyvvKKj_v;b?B/xYnVK(@4lq97ZuMn,DE~lNCh8"2nq4dAM^X=7iwBR{d^zBe7m*ics=/*HG~s09~v+51HYESnN<-EB-L<,uWQsx1j_A^[h01v7viT6T/y,;)jMEF7d/EEP6;L-?rD.%KO(B6%7#__doDzQ7_?)lOvTaco@+,/Q[OUCh+$~B{N^xd7Z{c5>+hbqomO';#ERROR: 1054 - \5217 'c1' \306F 'field list' \306B\306F\3042\308A\307E\305B\3093\3002#NOERROR
            ALTER TABLE t1 ADD b INT;
            

            Roel Roel Van de Paar added a comment - - edited Also seen during MDEV-33426 testing as: thd->transaction->stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)|SIGABRT|__vfwscanf_internal|Backtrace stopped: Cannot access memory at address| GDB: 11.4.0 faf48c262fd3f350b91482865b1bc1d8a037502b (Debug) Core was generated by `/test/MDEV33426_MD160224-mariadb-11.4.0-linux-x86_64-dbg/bin/mariadbd --no-defa'. Program terminated with signal SIGABRT, Aborted. #0 0x0000145f302969fc in __vfwscanf_internal (s=<optimized out>, format=<optimized out>, argptr=<optimized out>, mode_flags=<optimized out>)at ./stdio-common/vfscanf-internal.c:2472 [Current thread is 1 (LWP 1502930)] (gdb) bt #0 0x0000145f302969fc in __vfwscanf_internal (s=<optimized out>, format=<optimized out>, argptr=<optimized out>, mode_flags=<optimized out>)at ./stdio-common/vfscanf-internal.c:2472 Backtrace stopped: Cannot access memory at address 0x16eeda So memory corruption in connection with this bug seems likely. In this case the partially reduced testcase was (for reference, not replay): ALTER TABLE mysql.gtid_slave_pos ENGINE=Aria; CREATE TABLE t1 (c1 TINYINT NULL , c2 CHAR (5)) PARTITION BY LINEAR KEY (c1) PARTITIONS 99; INSERT INTO t1 VALUES (10,2), (20,3), (30,4), (40,5), (50,10); UPDATE t1 SET c1= 'v"*s:k(t_WUHbtu0/F3x/"LgJ5U6*LC3(jbsl&c^hm5?1svxw#GTUm/jg(aXz$zw?mv@=%"^C<]B#oZ,{G,,5VJeQEmh4Cy,E%KPtew{fydG9d"J?**`m&S#[ig]]U]3~`0AiFzMXkJ`Q5ng}umM*At;(up_]%0Ofk!/)e{r#A@8/;ReSUZfU..mrX;60txJ3f@boLC--kb]l6HRK,`f(h,B%Xz~I7dy86msuc/mq;tj%C~c4P.Z]S"%{d_,"=O);Y[)@nVW]vc=5ts1>+cVUx%FPUCkU$@$hUU1^q`!dV8+AoDSAe.]Ufk}zxgREw=)aH$UXmUtOB+rCYjB-4^kdFxSb*9<LWBy#uqvcuQuw%.vU=uAbFLpD$08(&jD(`)8B6G,E(<.S#k=x^-h7Z25)V+Y$>`GHhSc5@h*sjGd7&"WobF>Q9Rgso~`9*SQ~7?<y+r!JW.v/oMq+LW+4g]T>Ew58SgDW>Y$H`C4BTtO`-tC26Y3FvUCp3O.h1:bk3[R^i`%*1UgP=L<m1&_x:#w`ztl^`>)WeE3UR%IV2=5bsJPKG6}JcRsFW^3r!o,1EJS5:J1Yxv]tHE1/GqSY!Qu`FL`dzv)}(r"&LzVrsofkdEl`gCzRgt+:HmkkZJ!6*US-%D^X[4p5:qU]1j"?N;IfaJv6/<CQ5$tB8zpLY)KT665P85<Jl*l]sBhIAn"0?!h0+=6o2%HgPnlExb_.~SkPm$+.;yP41`^3a8_YMO9oyvvKKj_v;b?B/xYnVK(@4lq97ZuMn,DE~lNCh8"2nq4dAM^X=7iwBR{d^zBe7m*ics=/*HG~s09~v+51HYESnN<-EB-L<,uWQsx1j_A^[h01v7viT6T/y,;)jMEF7d/EEP6;L-?rD.%KO(B6%7#__doDzQ7_?)lOvTaco@+,/Q[OUCh+$~B{N^xd7Z{c5>+hbqomO' ;#ERROR: 1054 - \5217 'c1' \306F 'field list' \306B\306F\3042\308A\307E\305B\3093\3002#NOERROR ALTER TABLE t1 ADD b INT ;
            Roel Roel Van de Paar added a comment - - edited

            The issue reproduces readily with MBR, RBR, SBR. An easy way to repeat is to set parallel threads to 65 on a standard m/s setup, then execute on the master using the CLI:

            ALTER TABLE mysql.gtid_slave_pos DROP PRIMARY KEY;
            

            Then loop:

            CREATE TABLE t (c1 INT) ENGINE=InnoDB;
            INSERT INTO t VALUES (1);
            ALTER TABLE t ADD COLUMN c INT;
            DROP TABLE t;
            

            Until the slave crashes. 20 Orso times should be plenty.

            This same method does not work for the secondary original testcase.

            Roel Roel Van de Paar added a comment - - edited The issue reproduces readily with MBR, RBR, SBR. An easy way to repeat is to set parallel threads to 65 on a standard m/s setup, then execute on the master using the CLI: ALTER TABLE mysql.gtid_slave_pos DROP PRIMARY KEY ; Then loop: CREATE TABLE t (c1 INT ) ENGINE=InnoDB; INSERT INTO t VALUES (1); ALTER TABLE t ADD COLUMN c INT ; DROP TABLE t; Until the slave crashes. 20 Orso times should be plenty. This same method does not work for the secondary original testcase.

            It is plausible that debugging MDEV-34010 will assist with resolving this bug.

            Roel Roel Van de Paar added a comment - It is plausible that debugging MDEV-34010 will assist with resolving this bug.

            People

              Elkin Andrei Elkin
              Roel Roel Van de Paar
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.