Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13820

trx_id_check() fails during row_log_table_apply()

Details

    Description

      10.3 3722372ae56b7ea90

      mysqld: /data/src/10.3/storage/innobase/include/trx0sys.h:186: bool trx_id_check(const void*, trx_id_t): Assertion `id == 0 || id > trx_id' failed.
      171024 20:59:53 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f00bc64bee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x00005639db364f32 in trx_id_check (db_trx_id=0x7f00b6258082, trx_id=1287) at /data/src/10.3/storage/innobase/include/trx0sys.h:186
      #9  0x00005639db384c68 in row_log_table_apply_delete (thr=0x7f006c14b160, trx_id_col=1, mrec=0x7f00b40df01f "\200", moffsets=0x7f006c17d230, offsets_heap=0x7f006c0645d0, heap=0x7f006c060540, log=0x7f006c040400, save_ext=0x0, ext_size=0) at /data/src/10.3/storage/innobase/row/row0log.cc:2065
      #10 0x00005639db3869b9 in row_log_table_apply_op (thr=0x7f006c14b160, new_trx_id_col=1, dup=0x7f00b426d560, error=0x7f00b426d30c, offsets_heap=0x7f006c0645d0, heap=0x7f006c060540, mrec=0x7f00b40df01f "\200", mrec_end=0x7f00b40df030 "", offsets=0x7f006c17d230) at /data/src/10.3/storage/innobase/row/row0log.cc:2624
      #11 0x00005639db388ca8 in row_log_table_apply_ops (thr=0x7f006c14b160, dup=0x7f00b426d560, stage=0x7f006c17b350) at /data/src/10.3/storage/innobase/row/row0log.cc:3177
      #12 0x00005639db389267 in row_log_table_apply (thr=0x7f006c14b160, old_table=0x7f006c093d88, table=0x7f006c152bd0, stage=0x7f006c17b350) at /data/src/10.3/storage/innobase/row/row0log.cc:3276
      #13 0x00005639db259e2f in commit_try_rebuild (ha_alter_info=0x7f00b426e520, ctx=0x7f006c015fc8, altered_table=0x7f006c152bd0, old_table=0x7f006c14fbd0, trx=0x7f00b65780d8, table_name=0x7f006c14e20d "t2") at /data/src/10.3/storage/innobase/handler/handler0alter.cc:8301
      #14 0x00005639db254a90 in ha_innobase::commit_inplace_alter_table (this=0x7f006c1507f8, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, commit=true) at /data/src/10.3/storage/innobase/handler/handler0alter.cc:9137
      #15 0x00005639daef0c76 in handler::ha_commit_inplace_alter_table (this=0x7f006c1507f8, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, commit=true) at /data/src/10.3/sql/handler.cc:4251
      #16 0x00005639dacf7f58 in mysql_inplace_alter_table (thd=0x7f006c000b00, table_list=0x7f006c014ae8, table=0x7f006c14fbd0, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f00b426e590, alter_ctx=0x7f00b426f150) at /data/src/10.3/sql/sql_table.cc:7229
      #17 0x00005639dacfd23c in mysql_alter_table (thd=0x7f006c000b00, new_db=0x7f006c0150f8 "test", new_name=0x0, create_info=0x7f00b426fd60, table_list=0x7f006c014ae8, alter_info=0x7f00b426fcb0, order_num=0, order=0x0, ignore=false) at /data/src/10.3/sql/sql_table.cc:9205
      #18 0x00005639dad7be7b in Sql_cmd_alter_table::execute (this=0x7f006c015100, thd=0x7f006c000b00) at /data/src/10.3/sql/sql_alter.cc:331
      #19 0x00005639dac2f172 in mysql_execute_command (thd=0x7f006c000b00) at /data/src/10.3/sql/sql_parse.cc:6236
      #20 0x00005639dac33a1c in mysql_parse (thd=0x7f006c000b00, rawbuf=0x7f006c0149f8 "ALTER TABLE t2 ROW_FORMAT=DYNAMIC", length=33, parser_state=0x7f00b4271610, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:7921
      #21 0x00005639dac211bd in dispatch_command (command=COM_QUERY, thd=0x7f006c000b00, packet=0x7f006c08fce1 "ALTER TABLE t2 ROW_FORMAT=DYNAMIC", packet_length=33, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:1819
      #22 0x00005639dac1fc1b in do_command (thd=0x7f006c000b00) at /data/src/10.3/sql/sql_parse.cc:1370
      #23 0x00005639dad769f4 in do_handle_one_connection (connect=0x5639de539bc0) at /data/src/10.3/sql/sql_connect.cc:1418
      #24 0x00005639dad76781 in handle_one_connection (arg=0x5639de539bc0) at /data/src/10.3/sql/sql_connect.cc:1324
      #25 0x00005639db1f0f4c in pfs_spawn_thread (arg=0x5639de5567b0) at /data/src/10.3/storage/perfschema/pfs.cc:1862
      #26 0x00007f00be590494 in start_thread (arg=0x7f00b4272700) at pthread_create.c:333
      #27 0x00007f00bc70893f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      The test case is not deterministic. If it doesn't fail right away, try running with --repeat.

      Also, running with innodb_flush_log_at_trx_commit=2 significantly increased the probability of the race condition in original stress tests while running on disk – once in a few minutes vs once in many hours; and further, running in shm increased the probability even more, to once in a few seconds. Currently the MTR test below fails pretty much every time for me, even on disk and without innodb_flush_log_at_trx_commit=2, so I cannot say whether those options have any effect on it.

      --source include/have_innodb.inc
       
      CREATE TABLE t1 (pk INT PRIMARY KEY, i INT) ENGINE=InnoDB;
      CREATE TABLE t2 (pk INT PRIMARY KEY, i INT) ENGINE=InnoDB;
       
      --connect (con1,localhost,root,,test)
      BEGIN;
      --send
        INSERT INTO t1 VALUES (1,10);
       
      --connection default
      --send
        ALTER TABLE t2 ROW_FORMAT=DYNAMIC;
       
      --connection con1
      --reap
      INSERT INTO t2 VALUES (2,20);
      ROLLBACK;
       
      --connection default
      --reap
       
      # Cleanup
      --disconnect con1
      --connection default
      DROP TABLE t1, t2;
      

      Does not fail on 10.2.
      No visible effect on 10.3 non-debug build.
      The problem rather badly affects upgrade tests on 10.3 (the laste step, when DDL/DML is running on the upgraded server), causes a lot of failures unrelated to upgrade as such.

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Attachment threads [ 44368 ]
            Fix Version/s 10.3 [ 22126 ]
            Description http://buildbot.askmonty.org/buildbot/builders/qa-win-debug/builds/490/steps/crash_tests/logs/stdio
            {noformat}
            Version: '10.3.2-MariaDB-debug-log' socket: '' port: 11500 Source distribution
            Assertion failed: id == 0 || id > trx_id, file d:\qa-win-debug\build\storage\innobase\include\trx0sys.h, line 186
            170915 19:06:50 [ERROR] mysqld got exception 0x80000003 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.

            To report this bug, see https://mariadb.com/kb/en/reporting-bugs

            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed,
            something is definitely wrong and this may fail.

            Server version: 10.3.2-MariaDB-debug-log
            key_buffer_size=1048576
            read_buffer_size=131072
            max_used_connections=7
            max_threads=65537
            thread_count=12
            It is possible that mysqld could use up to
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4242 K bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.

            Thread pointer: 0x7a60a4d268
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            mysqld.exe!my_sigabrt_handler()[my_thr_init.c:488]
            mysqld.exe!raise()[signal.cpp:516]
            mysqld.exe!abort()[abort.cpp:71]
            mysqld.exe!common_assert_to_stderr<wchar_t>()[assert.cpp:149]
            mysqld.exe!_wassert()[assert.cpp:404]
            mysqld.exe!trx_id_check()[trx0sys.h:186]
            mysqld.exe!row_log_table_apply_delete()[row0log.cc:1978]
            mysqld.exe!row_log_table_apply_op()[row0log.cc:2524]
            mysqld.exe!row_log_table_apply_ops()[row0log.cc:3050]
            mysqld.exe!row_log_table_apply()[row0log.cc:3152]
            mysqld.exe!ha_innobase::inplace_alter_table()[handler0alter.cc:6485]
            mysqld.exe!handler::ha_inplace_alter_table()[handler.h:3831]
            mysqld.exe!mysql_inplace_alter_table()[sql_table.cc:7203]
            mysqld.exe!mysql_alter_table()[sql_table.cc:9203]
            mysqld.exe!Sql_cmd_alter_table::execute()[sql_alter.cc:325]
            mysqld.exe!mysql_execute_command()[sql_parse.cc:6235]
            mysqld.exe!mysql_parse()[sql_parse.cc:7921]
            mysqld.exe!dispatch_command()[sql_parse.cc:1821]
            mysqld.exe!do_command()[sql_parse.cc:1369]
            mysqld.exe!threadpool_process_request()[threadpool_common.cc:366]
            mysqld.exe!tp_callback()[threadpool_common.cc:192]
            mysqld.exe!tp_callback()[threadpool_win.cc:378]
            mysqld.exe!work_callback()[threadpool_win.cc:452]
            ntdll.dll!RtlFreeUnicodeString()
            ntdll.dll!RtlFreeUnicodeString()
            KERNEL32.DLL!BaseThreadInitThunk()
            ntdll.dll!RtlUserThreadStart()

            Trying to get some variables.
            Some pointers may be invalid and cause the dump to abort.
            Query (0x7a60ad5940): ALTER TABLE t1 ADD COLUMN extra INT /* QNO 3109 CON_ID 20 */
            Connection ID (thread ID): 20
            Status: NOT_KILLED
            {noformat}

            {noformat}
            E:\buildbot\rqg/runall.pl --no-mask --seed=1505491264 --threads=5 --duration=400 --queries=100M --reporters=QueryTimeout,Backtrace,ErrorLog,Deadlock,Shutdown --redefine=conf/mariadb/redefine_random_keys.yy --redefine=conf/mariadb/redefine_set_session_vars.yy --validators=TransformerNoComparator --transformers=ConvertSubqueriesToViews,DisableOptimizations,EnableOptimizations,ExecuteAsCTE,ExecuteAsInsertSelect,ExecuteAsPreparedOnce,ExecuteAsSelectItem,ExecuteAsUnion,ExecuteAsUpdateDelete,ExecuteAsView,NullIf,OrderBy,StraightJoin,ExecuteAsExecuteImmediate --grammar=conf/runtime/alter_online.yy --gendata=conf/runtime/alter_online.zz --mtr-build-thread=150 --basedir1=D:\qa-win-debug\build --vardir1=E:\buildbot\vardirs\qa-win-debug\10.3-490\optim-crash-tests/current1_1
            {noformat}

            Not reproducible right away.
            {noformat:title=10.3 3722372ae56b7ea90}
            mysqld: /data/src/10.3/storage/innobase/include/trx0sys.h:186: bool trx_id_check(const void*, trx_id_t): Assertion `id == 0 || id > trx_id' failed.
            171024 20:59:53 [ERROR] mysqld got signal 6 ;

            #7 0x00007f00bc64bee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
            #8 0x00005639db364f32 in trx_id_check (db_trx_id=0x7f00b6258082, trx_id=1287) at /data/src/10.3/storage/innobase/include/trx0sys.h:186
            #9 0x00005639db384c68 in row_log_table_apply_delete (thr=0x7f006c14b160, trx_id_col=1, mrec=0x7f00b40df01f "\200", moffsets=0x7f006c17d230, offsets_heap=0x7f006c0645d0, heap=0x7f006c060540, log=0x7f006c040400, save_ext=0x0, ext_size=0) at /data/src/10.3/storage/innobase/row/row0log.cc:2065
            #10 0x00005639db3869b9 in row_log_table_apply_op (thr=0x7f006c14b160, new_trx_id_col=1, dup=0x7f00b426d560, error=0x7f00b426d30c, offsets_heap=0x7f006c0645d0, heap=0x7f006c060540, mrec=0x7f00b40df01f "\200", mrec_end=0x7f00b40df030 "", offsets=0x7f006c17d230) at /data/src/10.3/storage/innobase/row/row0log.cc:2624
            #11 0x00005639db388ca8 in row_log_table_apply_ops (thr=0x7f006c14b160, dup=0x7f00b426d560, stage=0x7f006c17b350) at /data/src/10.3/storage/innobase/row/row0log.cc:3177
            #12 0x00005639db389267 in row_log_table_apply (thr=0x7f006c14b160, old_table=0x7f006c093d88, table=0x7f006c152bd0, stage=0x7f006c17b350) at /data/src/10.3/storage/innobase/row/row0log.cc:3276
            #13 0x00005639db259e2f in commit_try_rebuild (ha_alter_info=0x7f00b426e520, ctx=0x7f006c015fc8, altered_table=0x7f006c152bd0, old_table=0x7f006c14fbd0, trx=0x7f00b65780d8, table_name=0x7f006c14e20d "t2") at /data/src/10.3/storage/innobase/handler/handler0alter.cc:8301
            #14 0x00005639db254a90 in ha_innobase::commit_inplace_alter_table (this=0x7f006c1507f8, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, commit=true) at /data/src/10.3/storage/innobase/handler/handler0alter.cc:9137
            #15 0x00005639daef0c76 in handler::ha_commit_inplace_alter_table (this=0x7f006c1507f8, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, commit=true) at /data/src/10.3/sql/handler.cc:4251
            #16 0x00005639dacf7f58 in mysql_inplace_alter_table (thd=0x7f006c000b00, table_list=0x7f006c014ae8, table=0x7f006c14fbd0, altered_table=0x7f006c152bd0, ha_alter_info=0x7f00b426e520, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f00b426e590, alter_ctx=0x7f00b426f150) at /data/src/10.3/sql/sql_table.cc:7229
            #17 0x00005639dacfd23c in mysql_alter_table (thd=0x7f006c000b00, new_db=0x7f006c0150f8 "test", new_name=0x0, create_info=0x7f00b426fd60, table_list=0x7f006c014ae8, alter_info=0x7f00b426fcb0, order_num=0, order=0x0, ignore=false) at /data/src/10.3/sql/sql_table.cc:9205
            #18 0x00005639dad7be7b in Sql_cmd_alter_table::execute (this=0x7f006c015100, thd=0x7f006c000b00) at /data/src/10.3/sql/sql_alter.cc:331
            #19 0x00005639dac2f172 in mysql_execute_command (thd=0x7f006c000b00) at /data/src/10.3/sql/sql_parse.cc:6236
            #20 0x00005639dac33a1c in mysql_parse (thd=0x7f006c000b00, rawbuf=0x7f006c0149f8 "ALTER TABLE t2 ROW_FORMAT=DYNAMIC", length=33, parser_state=0x7f00b4271610, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:7921
            #21 0x00005639dac211bd in dispatch_command (command=COM_QUERY, thd=0x7f006c000b00, packet=0x7f006c08fce1 "ALTER TABLE t2 ROW_FORMAT=DYNAMIC", packet_length=33, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:1819
            #22 0x00005639dac1fc1b in do_command (thd=0x7f006c000b00) at /data/src/10.3/sql/sql_parse.cc:1370
            #23 0x00005639dad769f4 in do_handle_one_connection (connect=0x5639de539bc0) at /data/src/10.3/sql/sql_connect.cc:1418
            #24 0x00005639dad76781 in handle_one_connection (arg=0x5639de539bc0) at /data/src/10.3/sql/sql_connect.cc:1324
            #25 0x00005639db1f0f4c in pfs_spawn_thread (arg=0x5639de5567b0) at /data/src/10.3/storage/perfschema/pfs.cc:1862
            #26 0x00007f00be590494 in start_thread (arg=0x7f00b4272700) at pthread_create.c:333
            #27 0x00007f00bc70893f in clone () from /lib/x86_64-linux-gnu/libc.so.6
            {noformat}

            The test case is not deterministic. If it doesn't fail right away, try running with {{--repeat}}.

            Also, running with {{innodb_flush_log_at_trx_commit=2}} significantly increased the probability of the race condition in original stress tests while running on disk -- once in a few minutes vs once in many hours; and further, running in shm increased the probability even more, to once in a few seconds. Currently the MTR test below fails pretty much every time for me, even on disk and without {{innodb_flush_log_at_trx_commit=2}}, so I cannot say whether those options have any effect on it.

            {code:sql}
            --source include/have_innodb.inc

            CREATE TABLE t1 (pk INT PRIMARY KEY, i INT) ENGINE=InnoDB;
            CREATE TABLE t2 (pk INT PRIMARY KEY, i INT) ENGINE=InnoDB;

            --connect (con1,localhost,root,,test)
            BEGIN;
            --send
              INSERT INTO t1 VALUES (1,10);

            --connection default
            --send
              ALTER TABLE t2 ROW_FORMAT=DYNAMIC;

            --connection con1
            --reap
            INSERT INTO t2 VALUES (2,20);
            ROLLBACK;

            --connection default
            --reap

            # Cleanup
            --disconnect con1
            --connection default
            DROP TABLE t1, t2;
            {code}

            Does not fail on 10.2.
            No visible effect on 10.3 non-debug build.
            The problem rather badly affects upgrade tests on 10.3 (the laste step, when DDL/DML is running on the upgraded server), causes a lot of failures unrelated to upgrade as such.
            Summary [Draft] Assertion failed: id == 0 || id > trx_id, file d:\qa-win-debug\build\storage\innobase\include\trx0sys.h, line 186 Assertion `id == 0 || id > trx_id' failed in trx_id_check(const void*, trx_id_t)
            elenst Elena Stepanova made changes -
            Assignee Elena Stepanova [ elenst ] Marko Mäkelä [ marko ]
            elenst Elena Stepanova made changes -
            Attachment master.log [ 44212 ]
            elenst Elena Stepanova made changes -
            Comment [ Happened again
            http://buildbot.askmonty.org/buildbot/builders/qa-win-debug/builds/515/steps/crash_tests/logs/stdio
            {noformat}
            E:\buildbot\rqg/runall.pl --no-mask --seed=1505858065 --threads=5 --duration=400 --queries=100M --reporters=QueryTimeout,Backtrace,ErrorLog,Deadlock,Shutdown --redefine=conf/mariadb/redefine_random_keys.yy --redefine=conf/mariadb/redefine_set_session_vars.yy --validators=TransformerNoComparator --transformers=ConvertSubqueriesToViews,DisableOptimizations,EnableOptimizations,ExecuteAsCTE,ExecuteAsInsertSelect,ExecuteAsPreparedOnce,ExecuteAsSelectItem,ExecuteAsUnion,ExecuteAsUpdateDelete,ExecuteAsView,NullIf,OrderBy,StraightJoin,ExecuteAsExecuteImmediate --grammar=conf/runtime/alter_online.yy --gendata=conf/runtime/alter_online.zz --mtr-build-thread=150 --basedir1=D:\qa-win-debug\build --vardir1=E:\buildbot\vardirs\qa-win-debug\10.3-515\optim-crash-tests/current1_1
            {noformat} ]
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            marko Marko Mäkelä made changes -
            Sprint 10.3.3-2 [ 208 ]
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            Summary Assertion `id == 0 || id > trx_id' failed in trx_id_check(const void*, trx_id_t) trx_id_check() fails during row_log_table_apply()
            marko Marko Mäkelä made changes -
            issue.field.resolutiondate 2017-12-07 12:41:17.0 2017-12-07 12:41:17.798
            marko Marko Mäkelä made changes -
            Fix Version/s 10.3.3 [ 22644 ]
            Fix Version/s 10.3 [ 22126 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Sprint 10.3.3-2 [ 208 ]
            serg Sergei Golubchik made changes -
            Rank Ranked higher
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 82610 ] MariaDB v4 [ 152832 ]

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.