Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32144

Debug assertion failure w == MAYBE_NOP in mtr_t::memcpy(), trx_undo_write_xid()

Details

    Description

      Elkin reported the following crash in a debug build:

      mariadb-10.6.14 with some changes

      #0  __GI_raise (sig=sig@entry=6) at raise.c:51
      #1  0x00007ffff4b467f1 in __GI_abort () at abort.c:79
      #2  0x00007ffff4b363fa in __assert_fail_base (fmt=0x7ffff4cbd6c0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x555556fff6d9 "w == MAYBE_NOP", file=file@entry=0x555556ffb7a0 "/home2/me/MEP/WTs/10.6/A0/storage/innobase/include/mtr0log.h", line=line@entry=503, function=function@entry=0x555556fff978 "void mtr_t::memcpy(const buf_block_t&, void*, const void*, ulint) [with mtr_t::write_type w = mtr_t::NORMAL; ulint = long unsigned int]") at assert.c:92
      #3  0x00007ffff4b36472 in __GI___assert_fail (assertion=0x555556fff6d9 "w == MAYBE_NOP", file=0x555556ffb7a0 "/home2/me/MEP/WTs/10.6/A0/storage/innobase/include/mtr0log.h", line=503, function=0x555556fff978 "void mtr_t::memcpy(const buf_block_t&, void*, const void*, ulint) [with mtr_t::write_type w = mtr_t::NORMAL; ulint = long unsigned int]") at assert.c:101
      #4  0x0000555556803e64 in mtr_t::memcpy<(mtr_t::write_type)0> (this=0x7fffe8c704a0, b=..., dest=0x7fffea32d3f5, str=0x7fffb403d928, len=2) at mtr0log.h:503
      #5  0x000055555694a218 in trx_undo_write_xid (block=0x7fffe9c74bc0, offset=5051, xid=..., mtr=0x7fffe8c704a0) at trx0undo.cc:642
      #6  0x000055555694ceed in trx_undo_set_state_at_prepare (trx=0x7fffeac72980, undo=0x7fffb403d8f8, rollback=false, mtr=0x7fffe8c704a0) at trx0undo.cc:1539
      #7  0x000055555693f653 in trx_prepare_low (trx=0x7fffeac72980) at trx0trx.cc:1848
      #8  0x000055555693f788 in trx_prepare (trx=0x7fffeac72980) at trx0trx.cc:1869
      #9  0x000055555693f9a7 in trx_prepare_for_mysql (trx=0x7fffeac72980) at trx0trx.cc:1922
      #10 0x00005555566e6d69 in innobase_xa_prepare (hton=0x55555859d768, thd=0x7fffb0016c08, prepare_trx=true) at ha_innodb.cc:17061
      #11 0x00005555562a6d6f in prepare_or_error (ht=0x55555859d768, thd=0x7fffb0016c08, all=true) at handler.cc:1458
      #12 0x00005555562a6f38 in ha_prepare (thd=0x7fffb0016c08) at handler.cc:1504
      #13 0x00005555561d4b7d in trans_xa_prepare (thd=0x7fffb0016c08) at xa.cc:547
      #14 0x0000555555eec1ee in mysql_execute_command (thd=0x7fffb0016c08, is_called_from_prepared_stmt=false) at sql_parse.cc:5874
      #15 0x0000555555ef2e37 in mysql_parse (thd=0x7fffb0016c08, rawbuf=0x7fffb0025a10 "XA PREPARE '9','5',1", length=20, parser_state=0x7fffe8c71310) at sql_parse.cc:8063
      

      I believe that the reason for this is that an identical XA PREPARE statement had been executed in the past.

      This bug does not have any impact on release builds. The purpose of the debug assertion is to catch redundant calls for writing redo log for something that did not actually change.

      Like the debug assertion expression suggests, this can be silenced by passing the appropriate template parameter.

      Attachments

        Issue Links

          Activity

            Roel Roel Van de Paar added a comment - - edited

            FYI, I also ran into this as (only the UniqueID was preserved from this occurrence):

            w == MAYBE_NOP|SIGABRT|trx_undo_write_xid|trx_undo_set_state_at_prepare|trx_prepare_low|trx_prepare
            

            Which is similar to the above but slightly different. This was in a debug 11.3 build of 6 Feb 2024.
            The partially reduced testcase at that time was:

            #mysqld options in use at occurrence time: --no-defaults --max_allowed_packet=33554432 --maximum-bulk_insert_buffer_size=1M --maximum-join_buffer_size=1M --maximum-max_heap_table_size=1M --maximum-max_join_size=1M --maximum-myisam_max_sort_file_size=1M --maximum-myisam_mmap_size=1M --maximum-myisam_sort_buffer_size=1M --maximum-optimizer_trace_max_mem_size=1M --maximum-preload_buffer_size=1M --maximum-query_alloc_block_size=1M --maximum-query_prealloc_size=1M --maximum-range_alloc_block_size=1M --maximum-read_buffer_size=1M --maximum-read_rnd_buffer_size=1M --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --sql_mode= --innodb_file_per_table=1 --innodb_flush_method=O_DIRECT --innodb_stats_persistent=off --loose-idle_write_transaction_timeout=0 --loose-idle_transaction_timeout=0 --loose-idle_readonly_transaction_timeout=0 --connect_timeout=60 --interactive_timeout=28800 --slave_net_timeout=60 --net_read_timeout=30 --net_write_timeout=60 --wait_timeout=28800 --lock-wait-timeout=86400 --innodb-lock-wait-timeout=50 --log_output=FILE --log-bin --log_bin_trust_function_creators=1 --loose-max-statement-time=30 --loose-debug_assert_on_not_freed_memory=0 --innodb-buffer-pool-size=300M"
            CREATE TABLE t0 (a int(0) auto_increment,b text,c varchar(0),PRIMARY KEY (a),FULLTEXT KEY a(b,c)) engine=innodb;#NOERROR;
            xa start 0x0,0x0,0xb;#NOERROR;
            INSERT INTO t0 VALUES (0,0,NULL);#NOERROR;
            insert into t0 values (NULL,NULL,0);#ERRONa ne odgovara broju vrednosti u slogu 0;
            INSERT INTO t0 VALUES(NULL,NULL,0);#ERRONa ne odgovara broju vrednosti u slogu 0;
            insert INTO t0 (b) values(NULL);#ERRONa NULL u NULL;
            DELETE FROM t0;#NOERROR;
            SET @@session.max_insert_delayed_threads=0;#NOERROR;
            replace INTO t0 values (),();#NOERROR;
            INSERT DELAYED INTO t0 VALUES(NULL,NULL,NULL);#NOERROR;
            INSERT INTO t0 VALUES (NULL,NULL,NULL);#ERRONa ne odgovara broju vrednosti u slogu 0;
            INSERT INTO t0 VALUES (NULL,NULL,0);#ERRONa NULL u NULL;
            INSERT INTO t0 (a) SELECT 0 FROM t0;#ERROR: 0 - Dupliran unos NULL za klju\0D NULL;
            insert into t0 (a) values (NULL);#NOERROR;
            INSERT INTO t0 VALUES(0,@short_value,DEFAULT);#NOERROR;
            DELETE FROM t0;#NOERROR;
            INSERT INTO t0 VALUES (),();#ERROR: 0 - Out of range value for c NULL at row 0;
            XA END 0x0,0x0,0xb;#NOERROR;
            XA PREPARE 0x0,0x0,0xb;#NOERROR;
            SHUTDOWN;
            

            The issue is no longer repeatable on 11.4 debug at b86a2f03b6a9a0b5e222fb2f52b07c85c491479e.

            Roel Roel Van de Paar added a comment - - edited FYI, I also ran into this as (only the UniqueID was preserved from this occurrence): w == MAYBE_NOP|SIGABRT|trx_undo_write_xid|trx_undo_set_state_at_prepare|trx_prepare_low|trx_prepare Which is similar to the above but slightly different. This was in a debug 11.3 build of 6 Feb 2024. The partially reduced testcase at that time was: #mysqld options in use at occurrence time : --no-defaults --max_allowed_packet=33554432 --maximum-bulk_insert_buffer_size=1M --maximum-join_buffer_size=1M --maximum-max_heap_table_size=1M --maximum-max_join_size=1M --maximum-myisam_max_sort_file_size=1M --maximum-myisam_mmap_size=1M --maximum-myisam_sort_buffer_size=1M --maximum-optimizer_trace_max_mem_size=1M --maximum-preload_buffer_size=1M --maximum-query_alloc_block_size=1M --maximum-query_prealloc_size=1M --maximum-range_alloc_block_size=1M --maximum-read_buffer_size=1M --maximum-read_rnd_buffer_size=1M --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --sql_mode= --innodb_file_per_table=1 --innodb_flush_method=O_DIRECT --innodb_stats_persistent=off --loose-idle_write_transaction_timeout=0 --loose-idle_transaction_timeout=0 --loose-idle_readonly_transaction_timeout=0 --connect_timeout=60 --interactive_timeout=28800 --slave_net_timeout=60 --net_read_timeout=30 --net_write_timeout=60 --wait_timeout=28800 --lock-wait-timeout=86400 --innodb-lock-wait-timeout=50 --log_output=FILE --log-bin --log_bin_trust_function_creators=1 --loose-max-statement-time=30 --loose-debug_assert_on_not_freed_memory=0 --innodb-buffer-pool-size=300M" CREATE TABLE t0 (a int (0) auto_increment,b text,c varchar (0), PRIMARY KEY (a),FULLTEXT KEY a(b,c)) engine=innodb;#NOERROR; xa start 0x0,0x0,0xb;#NOERROR; INSERT INTO t0 VALUES (0,0, NULL );#NOERROR; insert into t0 values ( NULL , NULL ,0);#ERRONa ne odgovara broju vrednosti u slogu 0; INSERT INTO t0 VALUES ( NULL , NULL ,0);#ERRONa ne odgovara broju vrednosti u slogu 0; insert INTO t0 (b) values ( NULL );#ERRONa NULL u NULL ; DELETE FROM t0;#NOERROR; SET @@session.max_insert_delayed_threads=0;#NOERROR; replace INTO t0 values (),();#NOERROR; INSERT DELAYED INTO t0 VALUES ( NULL , NULL , NULL );#NOERROR; INSERT INTO t0 VALUES ( NULL , NULL , NULL );#ERRONa ne odgovara broju vrednosti u slogu 0; INSERT INTO t0 VALUES ( NULL , NULL ,0);#ERRONa NULL u NULL ; INSERT INTO t0 (a) SELECT 0 FROM t0;#ERROR: 0 - Dupliran unos NULL za klju\0D NULL ; insert into t0 (a) values ( NULL );#NOERROR; INSERT INTO t0 VALUES (0,@short_value, DEFAULT );#NOERROR; DELETE FROM t0;#NOERROR; INSERT INTO t0 VALUES (),();#ERROR: 0 - Out of range value for c NULL at row 0; XA END 0x0,0x0,0xb;#NOERROR; XA PREPARE 0x0,0x0,0xb;#NOERROR; SHUTDOWN; The issue is no longer repeatable on 11.4 debug at b86a2f03b6a9a0b5e222fb2f52b07c85c491479e.

            The debug assertion failure merely indicates that there was an attempt to modify something with an assumption that the data was actually changed. In non-debug builds, nothing incorrect should happen in this case.

            While Elkin never supplied a reproducible test case for triggering the assertion failure, I suspect that it was related to reusing the same XA 2-phase-commit transaction identifier (XID) for multiple transactions. In the SQL snippet provided by Roel the trigger might be that the encoding of the XID matches something that may already have written at that spot of the undo log page by an earlier operation.

            marko Marko Mäkelä added a comment - The debug assertion failure merely indicates that there was an attempt to modify something with an assumption that the data was actually changed. In non-debug builds, nothing incorrect should happen in this case. While Elkin never supplied a reproducible test case for triggering the assertion failure, I suspect that it was related to reusing the same XA 2-phase-commit transaction identifier (XID) for multiple transactions. In the SQL snippet provided by Roel the trigger might be that the encoding of the XID matches something that may already have written at that spot of the undo log page by an earlier operation.
            Elkin Andrei Elkin added a comment -

            marko, iirc the assert was surfaced at running sqlbench script on my local branch. And it's gone after I applied your fixes.

            Elkin Andrei Elkin added a comment - marko , iirc the assert was surfaced at running sqlbench script on my local branch. And it's gone after I applied your fixes.

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.