Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15533

Assertion `log->blobs' failed in row_log_table_apply_update

Details

    Description

      Apparently belongs to the MDEV-14693 family.

      10.2 a05018977

      mysqld: /data/src/10.2/storage/innobase/row/row0log.cc:1985: dberr_t row_log_table_apply_update(que_thr_t*, ulint, const mrec_t*, const ulint*, mem_heap_t*, mem_heap_t*, row_merge_dup_t*, trx_id_t, const dtuple_t*): Assertion `log->blobs' failed.
      180309 17:43:15 [ERROR] mysqld got signal 6 ;
       
      Query (0x7eff34013618): ALTER TABLE `AA` ADD COLUMN c INTEGER DEFAULT NULL /* QNO 2942 CON_ID 16 */
      Connection ID (thread ID): 16
      Status: KILL_TIMEOUT
      

      #7  0x00007fe3fc442ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #8  0x0000560a3f71a73b in row_log_table_apply_update (thr=0x7fe388685620, new_trx_id_col=1, mrec=0x7fe3e00c7202 "\200", offsets=0x7fe388055920, offsets_heap=0x7fe3884b9940, heap=0x7fe3886dd470, dup=0x7fe3e0ef02e0, trx_id=6660, old_pk=0x7fe3886dd4f8) at /data/src/10.2/storage/innobase/row/row0log.cc:1985
      #9  0x0000560a3f71c061 in row_log_table_apply_op (thr=0x7fe388685620, trx_id_col=1, new_trx_id_col=1, dup=0x7fe3e0ef02e0, error=0x7fe3e0ef0084, offsets_heap=0x7fe3884b9940, heap=0x7fe3886dd470, mrec=0x7fe3e00c7202 "\200", mrec_end=0x7fe3e00d3b9c "", offsets=0x7fe388055920) at /data/src/10.2/storage/innobase/row/row0log.cc:2540
      #10 0x0000560a3f71d4bf in row_log_table_apply_ops (thr=0x7fe388685620, dup=0x7fe3e0ef02e0, stage=0x7fe388565430) at /data/src/10.2/storage/innobase/row/row0log.cc:2909
      #11 0x0000560a3f71d9ca in row_log_table_apply (thr=0x7fe388685620, old_table=0x7fe374189e88, table=0x7fe38805f050, stage=0x7fe388565430) at /data/src/10.2/storage/innobase/row/row0log.cc:3008
      #12 0x0000560a3f6048ae in ha_innobase::inplace_alter_table (this=0x7fe3741b1fa8, altered_table=0x7fe38805f050, ha_alter_info=0x7fe3e0ef0610) at /data/src/10.2/storage/innobase/handler/handler0alter.cc:6488
      #13 0x0000560a3f11a8f7 in handler::ha_inplace_alter_table (this=0x7fe3741b1fa8, altered_table=0x7fe38805f050, ha_alter_info=0x7fe3e0ef0610) at /data/src/10.2/sql/handler.h:3775
      #14 0x0000560a3f1117b1 in mysql_inplace_alter_table (thd=0x7fe388000ba0, table_list=0x7fe388013768, table=0x7fe3741b13a0, altered_table=0x7fe38805f050, ha_alter_info=0x7fe3e0ef0610, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7fe3e0ef0680, alter_ctx=0x7fe3e0ef1240) at /data/src/10.2/sql/sql_table.cc:7332
      #15 0x0000560a3f116de4 in mysql_alter_table (thd=0x7fe388000ba0, new_db=0x7fe388013d78 "test", new_name=0x0, create_info=0x7fe3e0ef1e50, table_list=0x7fe388013768, alter_info=0x7fe3e0ef1da0, order_num=0, order=0x0, ignore=false) at /data/src/10.2/sql/sql_table.cc:9353
      #16 0x0000560a3f190954 in Sql_cmd_alter_table::execute (this=0x7fe388013f20, thd=0x7fe388000ba0) at /data/src/10.2/sql/sql_alter.cc:324
      #17 0x0000560a3f04bc57 in mysql_execute_command (thd=0x7fe388000ba0) at /data/src/10.2/sql/sql_parse.cc:6208
      #18 0x0000560a3f0505d1 in mysql_parse (thd=0x7fe388000ba0, rawbuf=0x7fe388013618 "ALTER TABLE `E` ADD COLUMN m INTEGER DEFAULT NULL /* QNO 3376 CON_ID 16 */", length=74, parser_state=0x7fe3e0ef3200, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:7902
      #19 0x0000560a3f03e4f1 in dispatch_command (command=COM_QUERY, thd=0x7fe388000ba0, packet=0x7fe38807ed41 "ALTER TABLE `E` ADD COLUMN m INTEGER DEFAULT NULL /* QNO 3376 CON_ID 16 */ ", packet_length=75, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:1806
      #20 0x0000560a3f03ce54 in do_command (thd=0x7fe388000ba0) at /data/src/10.2/sql/sql_parse.cc:1360
      #21 0x0000560a3f18b646 in do_handle_one_connection (connect=0x560a41e93610) at /data/src/10.2/sql/sql_connect.cc:1335
      #22 0x0000560a3f18b3d3 in handle_one_connection (arg=0x560a41e93610) at /data/src/10.2/sql/sql_connect.cc:1241
      #23 0x0000560a3f5ab1f8 in pfs_spawn_thread (arg=0x560a41f35e50) at /data/src/10.2/storage/perfschema/pfs.cc:1862
      #24 0x00007fe3fe119494 in start_thread (arg=0x7fe3e0ef4700) at pthread_create.c:333
      #25 0x00007fe3fc4ff93f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      Getting it periodically on 10.2 while running

      experimental f6acdbc55e206

      perl ./runall-trials.pl --trials=20 --duration=350 --threads=6 --seed=1520525874 --reporters=Backtrace,ErrorLog,Deadlock --validators=TransformerNoComparator --transformers=ExecuteAsCTE,ExecuteAsDeleteReturning,ExecuteAsExecuteImmediate,ExecuteAsInsertSelect,ExecuteAsUnion,ExecuteAsUpdateDelete,ExecuteAsView,ExecuteAsPreparedTwice,ExecuteAsSPTwice --redefine=conf/mariadb/general-workarounds.yy --mysqld=--log_output=FILE --views --vcols --redefine=conf/mariadb/alter_table.yy --redefine=conf/mariadb/bulk_insert.yy --redefine=conf/mariadb/xa.yy --basedir=/data/bld/10.2 --mysqld=--log_bin_trust_function_creators=1 --mysqld=--log-bin --mysqld=--max-statement-time=30 --mysqld=--loose-debug_assert_on_not_freed_memory=0 --grammar=conf/runtime/performance_schema.yy --mysqld=--performance-schema --engine=InnoDB --vardir=/data/logs/travis3 --force
      

      No reliable information about other version. Most likely isn't worth digging, because root cause is likely to be the same as for the rest of MDEV-14693 family.

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            elenst Elena Stepanova made changes -
            elenst Elena Stepanova made changes -
            elenst Elena Stepanova made changes -
            elenst Elena Stepanova made changes -
            Attachment threads [ 45338 ]
            elenst Elena Stepanova added a comment - New occurrence on 10.3: https://api.travis-ci.org/v3/job/361026822/log.txt

            Problem (same assert + also an ALTER TABLE t1 ADD COLUMN) replayed on
            10.4.0-MariaDB-debug-log
            HEAD -> 10.4, origin/bb-10.4-bar, origin/10.4 5646c4315918c1e4127bfb68b70d1c5333bfc2c7 2018-10-09T12:02:35+04:00
            too.

            perl /mnt/r0/mleich/RQG_new/rqg.pl --grammar=conf/mariadb/table_stress.yy --gendata=conf/mariadb/table_stress.zz --gendata_sql=conf/mariadb/table_stress.sql --mysqld=--innodb_lock_schedule_algorithm=fcfs --mysqld=--innodb_adaptive_hash_index=OFF --mysqld=--loose-idle_write_transaction_timeout=0 --mysqld=--loose-idle_transaction_timeout=0 --mysqld=--loose-idle_readonly_transaction_timeout=0 --mysqld=--connect_timeout=60 --mysqld=--interactive_timeout=28800 --mysqld=--slave_net_timeout=60 --mysqld=--net_read_timeout=30 --mysqld=--net_write_timeout=60 --mysqld=--loose-table_lock_wait_timeout=50 --mysqld=--wait_timeout=28800 --mysqld=--lock-wait-timeout=86400 --mysqld=--innodb-lock-wait-timeout=50 --no-mask --queries=10M --duration=300 --seed=6 --views --validators=none --redefine=conf/mariadb/general-workarounds.yy --redefine=conf/mariadb/alter_table.yy --redefine=conf/mariadb/bulk_insert.yy --redefine=conf/mariadb/xa.yy --redefine=conf/mariadb/versioning.yy --redefine=conf/mariadb/sequences.yy --mysqld=--log_output=FILE --mysqld=--log-bin --mysqld=--log_bin_trust_function_creators=1 --mysqld=--loose-max-statement-time=30 --mysqld=--loose-debug_assert_on_not_freed_memory=0 --reporters=Backtrace,ErrorLog,Deadlock --engine=InnoDB --mysqld=--innodb-buffer-pool-size=256M --mysqld=--innodb_use_native_aio=0--seed=random --workdir=/mnt/r0/mleich/RQG_new/storage/1539097282/64 --vardir=/dev/shm/vardir/1539097282/64 --mtr-build-thread=1056 --basedir1=/work_m/10.4/bld_asan/ --no-mask
            

            mleich Matthias Leich added a comment - Problem (same assert + also an ALTER TABLE t1 ADD COLUMN) replayed on 10.4.0-MariaDB-debug-log HEAD -> 10.4, origin/bb-10.4-bar, origin/10.4 5646c4315918c1e4127bfb68b70d1c5333bfc2c7 2018-10-09T12:02:35+04:00 too. perl /mnt/r0/mleich/RQG_new/rqg.pl --grammar=conf/mariadb/table_stress.yy --gendata=conf/mariadb/table_stress.zz --gendata_sql=conf/mariadb/table_stress.sql --mysqld=--innodb_lock_schedule_algorithm=fcfs --mysqld=--innodb_adaptive_hash_index=OFF --mysqld=--loose-idle_write_transaction_timeout=0 --mysqld=--loose-idle_transaction_timeout=0 --mysqld=--loose-idle_readonly_transaction_timeout=0 --mysqld=--connect_timeout=60 --mysqld=--interactive_timeout=28800 --mysqld=--slave_net_timeout=60 --mysqld=--net_read_timeout=30 --mysqld=--net_write_timeout=60 --mysqld=--loose-table_lock_wait_timeout=50 --mysqld=--wait_timeout=28800 --mysqld=--lock-wait-timeout=86400 --mysqld=--innodb-lock-wait-timeout=50 --no-mask --queries=10M --duration=300 --seed=6 --views --validators=none --redefine=conf/mariadb/general-workarounds.yy --redefine=conf/mariadb/alter_table.yy --redefine=conf/mariadb/bulk_insert.yy --redefine=conf/mariadb/xa.yy --redefine=conf/mariadb/versioning.yy --redefine=conf/mariadb/sequences.yy --mysqld=--log_output=FILE --mysqld=--log-bin --mysqld=--log_bin_trust_function_creators=1 --mysqld=--loose-max-statement-time=30 --mysqld=--loose-debug_assert_on_not_freed_memory=0 --reporters=Backtrace,ErrorLog,Deadlock --engine=InnoDB --mysqld=--innodb-buffer-pool-size=256M --mysqld=--innodb_use_native_aio=0--seed=random --workdir=/mnt/r0/mleich/RQG_new/storage/1539097282/64 --vardir=/dev/shm/vardir/1539097282/64 --mtr-build-thread=1056 --basedir1=/work_m/10.4/bld_asan/ --no-mask
            mleich Matthias Leich added a comment - - edited

            I am just during grammar simplification.
            The uploaded grammar MDEV-15533.yy is some intermediate result of that.
            Up till today the likelihood to replay the problem is rather low (< 1% of all RQG runs).
             
            Command line (certain values need to be adjusted to local conditions):
            perl rqg.pl --duration=350 --queries=1000000 --threads=10 --no_mask  \
            --gendata=conf/mariadb/table_stress.zz --engine=InnoDB \
            --gendata_sql=conf/mariadb/table_stress.sql --mysqld=--net_read_timeout=30 \
            --mysqld=--slave_net_timeout=60 --mysqld=--transaction-isolation=READ-UNCOMMITTED \
            --mysqld=--loose_innodb_use_native_aio=0 --mysqld=--innodb-lock-wait-timeout=50 \
            --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
            --mysqld=--loose-idle_readonly_transaction_timeout=0 --mysqld=--log-output=none \
            --mysqld=--loose-idle_transaction_timeout=0 --mysqld=--wait_timeout=28800 \
            --mysqld=--loose-idle_write_transaction_timeout=0 --mysqld=--interactive_timeout=28800 \
            --mysqld=--innodb_stats_persistent=off --mysqld=--innodb-flush-log-at-trx-commit=0 \
            --mysqld=--net_write_timeout=60 --mysqld=--lock-wait-timeout=86400 \
            --mysqld=--loose-table_lock_wait_timeout=50 --mysqld=--connect_timeout=60 \
            --mysqld=--innodb_change_buffering_debug=1 --whitelist_statuses="'STATUS_SERVER_CRASHED' \
            --whitelist_patterns='mysqld: .{1,100}row0log.cc.{1,100}dberr_t row_log_table_apply_update.{1,200} Assertion .log->blobs. failed','mysqld: .{1,100}trx0rec.ic:.{1,100}trx_undo_rec_copy.{1,200}: Assertion .len < srv_page_size. failed.','InnoDB: Failing assertion: \(block\)->index \|\| my_atomic_loadlint\(\&\(block\)->n_pointers\) == 0','mysqld: .{1,150}lock0lock.cc:.{1,20} my_bool lock_rec_other_trx_holds_expl_callback.{1,200}: Assertion.{1,150}expl_lock->trx == arg->impl_trx. failed'" \
             --blacklist_statuses='STATUS_OK' \
            --blacklist_patterns="'Possible endless loop in grammar.','\[ERROR\] InnoDB: preallocating .{1,120} with error 28','  is truncated: expected core file size ','1034 Create index by sort failed','mysqld: .{1,100}rem0rec.cc.{1,100}rec_get_converted_size_comp_prefix_low.{1,200} Assertion ..col->mbmaxlen .{1,100}col->mbmaxlen. failed.'" \
             --reporters=Backtrace,ErrorLog --validators=None \
             --grammar=MDEV15533.yy \
            --workdir=/home/mleich/work/RQG_mleich1/storage/1554152142/34 \
            --vardir=/dev/shm/vardir/1554152142/34 \
            --mtr-build-thread=996 \
            --basedir1=/work/10.3/bld_debug --seed=6497
            
            

            mleich Matthias Leich added a comment - - edited I am just during grammar simplification. The uploaded grammar MDEV-15533.yy is some intermediate result of that. Up till today the likelihood to replay the problem is rather low (< 1% of all RQG runs).   Command line (certain values need to be adjusted to local conditions): perl rqg.pl --duration=350 --queries=1000000 --threads=10 --no_mask \ --gendata=conf/mariadb/table_stress.zz --engine=InnoDB \ --gendata_sql=conf/mariadb/table_stress.sql --mysqld=--net_read_timeout=30 \ --mysqld=--slave_net_timeout=60 --mysqld=--transaction-isolation=READ-UNCOMMITTED \ --mysqld=--loose_innodb_use_native_aio=0 --mysqld=--innodb-lock-wait-timeout=50 \ --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \ --mysqld=--loose-idle_readonly_transaction_timeout=0 --mysqld=--log-output=none \ --mysqld=--loose-idle_transaction_timeout=0 --mysqld=--wait_timeout=28800 \ --mysqld=--loose-idle_write_transaction_timeout=0 --mysqld=--interactive_timeout=28800 \ --mysqld=--innodb_stats_persistent=off --mysqld=--innodb-flush-log-at-trx-commit=0 \ --mysqld=--net_write_timeout=60 --mysqld=--lock-wait-timeout=86400 \ --mysqld=--loose-table_lock_wait_timeout=50 --mysqld=--connect_timeout=60 \ --mysqld=--innodb_change_buffering_debug=1 --whitelist_statuses="'STATUS_SERVER_CRASHED' \ --whitelist_patterns='mysqld: .{1,100}row0log.cc.{1,100}dberr_t row_log_table_apply_update.{1,200} Assertion .log->blobs. failed','mysqld: .{1,100}trx0rec.ic:.{1,100}trx_undo_rec_copy.{1,200}: Assertion .len < srv_page_size. failed.','InnoDB: Failing assertion: \(block\)->index \|\| my_atomic_loadlint\(\&\(block\)->n_pointers\) == 0','mysqld: .{1,150}lock0lock.cc:.{1,20} my_bool lock_rec_other_trx_holds_expl_callback.{1,200}: Assertion.{1,150}expl_lock->trx == arg->impl_trx. failed'" \ --blacklist_statuses='STATUS_OK' \ --blacklist_patterns="'Possible endless loop in grammar.','\[ERROR\] InnoDB: preallocating .{1,120} with error 28',' is truncated: expected core file size ','1034 Create index by sort failed','mysqld: .{1,100}rem0rec.cc.{1,100}rec_get_converted_size_comp_prefix_low.{1,200} Assertion ..col->mbmaxlen .{1,100}col->mbmaxlen. failed.'" \ --reporters=Backtrace,ErrorLog --validators=None \ --grammar=MDEV15533.yy \ --workdir=/home/mleich/work/RQG_mleich1/storage/1554152142/34 \ --vardir=/dev/shm/vardir/1554152142/34 \ --mtr-build-thread=996 \ --basedir1=/work/10.3/bld_debug --seed=6497
            mleich Matthias Leich made changes -
            Attachment MDEV-15533.yy [ 47893 ]
            alice Alice Sherepa added a comment -

            still reproducible. on 10.5:

            mysqld: /10.5/storage/innobase/row/row0log.cc:2121: dberr_t row_log_table_apply_update(que_thr_t*, ulint, const mrec_t*, const offset_t*, mem_heap_t*, mem_heap_t*, row_merge_dup_t*, const dtuple_t*): Assertion `log->blobs' failed.
            200211  2:04:15 [ERROR] mysqld got signal 6 ;
             
            row/row0log.cc:2123(row_log_table_apply_update(que_thr_t*, unsigned long, unsigned char const*, unsigned short const*, mem_block_info_t*, mem_block_info_t*, row_merge_dup_t*, dtuple_t const*))[0x20e9f35]
            row/row0log.cc:2664(row_log_table_apply_op(que_thr_t*, unsigned long, row_merge_dup_t*, dberr_t*, mem_block_info_t*, mem_block_info_t*, unsigned char const*, unsigned char const*, unsigned short*))[0x20ede07]
            row/row0log.cc:3037(row_log_table_apply_ops(que_thr_t*, row_merge_dup_t*, ut_stage_alter_t*))[0x20eeb7e]
            row/row0log.cc:3143(row_log_table_apply(que_thr_t*, dict_table_t*, TABLE*, ut_stage_alter_t*, dict_table_t*))[0x1e5a800]
            handler/handler0alter.cc:10539(alter_rebuild_apply_log(ha_innobase_inplace_ctx*, Alter_inplace_info*, TABLE*))[0x1e5c804]
            handler/handler0alter.cc:10739(ha_innobase::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x1178ba8]
            sql/handler.cc:4580(handler::ha_commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x19a7368]
            sql/ha_partition.cc:10251(ha_partition::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x1178ba8]
            sql/handler.cc:4580(handler::ha_commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0xc4a410]
            sql/sql_table.cc:7785(mysql_inplace_alter_table(THD*, TABLE_LIST*, TABLE*, TABLE*, Alter_inplace_info*, enum_alter_inplace_result, MDL_request*, Alter_table_ctx*))[0xc58a83]
            sql/sql_table.cc:10129(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool))[0xdcc8ab]
            sql/sql_alter.cc:523(Sql_cmd_alter_table::execute(THD*))[0xa11186]
            sql/sql_parse.cc:5962(mysql_execute_command(THD*))[0xa1d5f4]
            sql/sql_parse.cc:7991(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x9f598e]
            sql/sql_parse.cc:1848(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x9f2848]
            sql/sql_parse.cc:1364(do_command(THD*))[0xdb5999]
            sql/sql_connect.cc:1422(do_handle_one_connection(CONNECT*, bool))[0xdb537c]
            /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f44c313e6ba]
            /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f44c23cf41d]
            

            alice Alice Sherepa added a comment - still reproducible. on 10.5: mysqld: /10.5/storage/innobase/row/row0log.cc:2121: dberr_t row_log_table_apply_update(que_thr_t*, ulint, const mrec_t*, const offset_t*, mem_heap_t*, mem_heap_t*, row_merge_dup_t*, const dtuple_t*): Assertion `log->blobs' failed. 200211 2:04:15 [ERROR] mysqld got signal 6 ;   row/row0log.cc:2123(row_log_table_apply_update(que_thr_t*, unsigned long, unsigned char const*, unsigned short const*, mem_block_info_t*, mem_block_info_t*, row_merge_dup_t*, dtuple_t const*))[0x20e9f35] row/row0log.cc:2664(row_log_table_apply_op(que_thr_t*, unsigned long, row_merge_dup_t*, dberr_t*, mem_block_info_t*, mem_block_info_t*, unsigned char const*, unsigned char const*, unsigned short*))[0x20ede07] row/row0log.cc:3037(row_log_table_apply_ops(que_thr_t*, row_merge_dup_t*, ut_stage_alter_t*))[0x20eeb7e] row/row0log.cc:3143(row_log_table_apply(que_thr_t*, dict_table_t*, TABLE*, ut_stage_alter_t*, dict_table_t*))[0x1e5a800] handler/handler0alter.cc:10539(alter_rebuild_apply_log(ha_innobase_inplace_ctx*, Alter_inplace_info*, TABLE*))[0x1e5c804] handler/handler0alter.cc:10739(ha_innobase::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x1178ba8] sql/handler.cc:4580(handler::ha_commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x19a7368] sql/ha_partition.cc:10251(ha_partition::commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0x1178ba8] sql/handler.cc:4580(handler::ha_commit_inplace_alter_table(TABLE*, Alter_inplace_info*, bool))[0xc4a410] sql/sql_table.cc:7785(mysql_inplace_alter_table(THD*, TABLE_LIST*, TABLE*, TABLE*, Alter_inplace_info*, enum_alter_inplace_result, MDL_request*, Alter_table_ctx*))[0xc58a83] sql/sql_table.cc:10129(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool))[0xdcc8ab] sql/sql_alter.cc:523(Sql_cmd_alter_table::execute(THD*))[0xa11186] sql/sql_parse.cc:5962(mysql_execute_command(THD*))[0xa1d5f4] sql/sql_parse.cc:7991(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x9f598e] sql/sql_parse.cc:1848(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x9f2848] sql/sql_parse.cc:1364(do_command(THD*))[0xdb5999] sql/sql_connect.cc:1422(do_handle_one_connection(CONNECT*, bool))[0xdb537c] /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f44c313e6ba] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f44c23cf41d]
            alice Alice Sherepa made changes -
            Affects Version/s 10.3 [ 22126 ]
            Affects Version/s 10.4 [ 22408 ]
            alice Alice Sherepa made changes -
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            mleich Matthias Leich made changes -
            Component/s XA [ 12400 ]
            Fix Version/s 10.5 [ 23123 ]
            Affects Version/s 10.5.5 [ 24423 ]
            Assignee Elena Stepanova [ elenst ] Marko Mäkelä [ marko ]
            Labels rr-profile
            Summary [Draft] XA: Assertion `log->blobs' failed in row_log_table_apply_update Assertion `log->blobs' failed in row_log_table_apply_update

            mleich, the originally reported assertion failure for 10.2 is something else, which could also be possible in 10.0 and 10.1:

            	if (page_rec_is_infimum(btr_pcur_get_rec(&pcur))
            	    || btr_pcur_get_low_match(&pcur) < index->n_uniq) {
            		/* The record was not found. This should only happen
            		when an earlier ROW_T_INSERT or ROW_T_UPDATE was
            		diverted because BLOBs were freed when the insert was
            		later rolled back. */
             
            		ut_ad(log->blobs);
            

            I filed MDEV-23244 for the failure that you reproduced. It is specific to 10.3 and later.

            marko Marko Mäkelä added a comment - mleich , the originally reported assertion failure for 10.2 is something else, which could also be possible in 10.0 and 10.1: if (page_rec_is_infimum(btr_pcur_get_rec(&pcur)) || btr_pcur_get_low_match(&pcur) < index->n_uniq) { /* The record was not found. This should only happen when an earlier ROW_T_INSERT or ROW_T_UPDATE was diverted because BLOBs were freed when the insert was later rolled back. */   ut_ad( log ->blobs); I filed MDEV-23244 for the failure that you reproduced. It is specific to 10.3 and later.
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            Labels rr-profile rr-profile-analyzed
            marko Marko Mäkelä made changes -
            Labels rr-profile-analyzed need_rr rr-profile-analyzed
            elenst Elena Stepanova added a comment - - edited

            The initial failure was most certainly XA-specific. It was one of a bunch of problems revealed by enabling (usually disabled) XA flow in concurrent tests. MDEV-15518, MDEV-15490, MDEV-15532 and probably some more were a part of the same exercise. Some of them are now closed as "not reproducible / probably fixed in 10.2", I doubt it is accurate, not much was happening in 10.2 in this regard. But anyway, around the same time when this item, MDEV-15533, was filed, Marko's analysis in MDEV-14693 confirmed that the failures are a side-effect of the broken XA, which is why this issue was never actively pursued.
            It is not very difficult to create a non-deterministic XA-based test case for it (and generate an rr profile), it's just rather pointless since there is little to no chance anything can be done about it.

            Still, here is, for example, a test case which fails (non-deterministically) on current 10.1 (run with --repeat and on disk rather than in shm):

            --source include/have_innodb.inc
            --source include/have_sequence.inc
             
            CREATE TABLE t1 (a INT, b INT) ENGINE=InnoDB;
            CREATE VIEW v1 AS SELECT * FROM t1;
            INSERT INTO t1 (a) SELECT seq FROM seq_1_to_1000;
             
            --connect (con1,localhost,root,,test)
            XA START 'x';
            INSERT INTO t1 (a) VALUES (0);
            --error ER_XAER_RMFAIL
            FLUSH HOSTS;
             
            --connection default
            SET lock_wait_timeout= 1;
            --send
              ALTER TABLE t1 DROP b;
             
            --connection con1
            --error ER_WRONG_VALUE_FOR_VAR
            SET sql_mode= NULL;
            UPDATE v1 SET a = 0;
             
            --connection default
            --error ER_LOCK_WAIT_TIMEOUT
            --reap
            --disconnect con1
            DROP VIEW v1;
            DROP TABLE t1;
            

            An interesting part about this test case is that while it fails on 10.1 on the assertion and passes on 10.2-10.3 on whatever reason, it causes a strange effect on 10.4-10.5: there UPDATE stalls for innodb_lock_wait_timeout seconds (e.g. for 50 seconds if you run the test case with default settings). After 50 seconds, it succeeds and proceeds. I don't see right away from the test case logic why it would do so.

            elenst Elena Stepanova added a comment - - edited The initial failure was most certainly XA-specific. It was one of a bunch of problems revealed by enabling (usually disabled) XA flow in concurrent tests. MDEV-15518 , MDEV-15490 , MDEV-15532 and probably some more were a part of the same exercise. Some of them are now closed as "not reproducible / probably fixed in 10.2", I doubt it is accurate, not much was happening in 10.2 in this regard. But anyway, around the same time when this item, MDEV-15533 , was filed, Marko's analysis in MDEV-14693 confirmed that the failures are a side-effect of the broken XA, which is why this issue was never actively pursued. It is not very difficult to create a non-deterministic XA-based test case for it (and generate an rr profile), it's just rather pointless since there is little to no chance anything can be done about it. Still, here is, for example, a test case which fails (non-deterministically) on current 10.1 (run with --repeat and on disk rather than in shm): --source include/have_innodb.inc --source include/have_sequence.inc   CREATE TABLE t1 (a INT , b INT ) ENGINE=InnoDB; CREATE VIEW v1 AS SELECT * FROM t1; INSERT INTO t1 (a) SELECT seq FROM seq_1_to_1000;   --connect (con1,localhost,root,,test) XA START 'x' ; INSERT INTO t1 (a) VALUES (0); --error ER_XAER_RMFAIL FLUSH HOSTS;   --connection default SET lock_wait_timeout= 1; --send ALTER TABLE t1 DROP b;   --connection con1 --error ER_WRONG_VALUE_FOR_VAR SET sql_mode= NULL ; UPDATE v1 SET a = 0;   --connection default --error ER_LOCK_WAIT_TIMEOUT --reap --disconnect con1 DROP VIEW v1; DROP TABLE t1; An interesting part about this test case is that while it fails on 10.1 on the assertion and passes on 10.2-10.3 on whatever reason, it causes a strange effect on 10.4-10.5: there UPDATE stalls for innodb_lock_wait_timeout seconds (e.g. for 50 seconds if you run the test case with default settings). After 50 seconds, it succeeds and proceeds. I don't see right away from the test case logic why it would do so.
            elenst Elena Stepanova made changes -
            Component/s XA [ 12400 ]

            And here is a test case which seems to be failing equally well on all of current 10.1-10.5 (but it's still non-deterministic, so run with --repeat=N if it doesn't fail right away):

            --source include/have_innodb.inc
             
            CREATE TABLE t1 (a INT) ENGINE=InnoDB;
            XA BEGIN 'x';
            INSERT IGNORE INTO t1 VALUES (1),(2);
             
            --connect (con1,localhost,root,,test)
            SET lock_wait_timeout= 1;
            --send
              ALTER TABLE t1 FORCE;
             
            --connection default
            --error ER_NO_SUCH_TABLE
            UPDATE IGNORE xx SET x = 0;
            --error ER_XAER_RMFAIL
            FLUSH TABLES;
            UPDATE t1 SET a = 3;
            XA END 'x';
            XA ROLLBACK 'x';
             
            --connection con1
            --reap
             
            # Cleanup
            --disconnect con1
            --connection default
            DROP TABLE t1;
            

            10.5 74491955

            mariadbd: /data/src/10.5/storage/innobase/row/row0log.cc:2127: dberr_t row_log_table_apply_update(que_thr_t*, ulint, const mrec_t*, const rec_offs*, mem_heap_t*, mem_heap_t*, row_merge_dup_t*, const dtuple_t*): Assertion `log->blobs' failed.
            200725 16:08:18 [ERROR] mysqld got signal 6 ;
             
            #7  0x00007fa0af769f12 in __GI___assert_fail (assertion=0x55c7657c992a "log->blobs", file=0x55c7657c8cf8 "/data/src/10.5/storage/innobase/row/row0log.cc", line=2127, function=0x55c7657ccca0 <row_log_table_apply_update(que_thr_t*, unsigned long, unsigned char const*, unsigned short const*, mem_block_info_t*, mem_block_info_t*, row_merge_dup_t*, dtuple_t const*)::__PRETTY_FUNCTION__> "dberr_t row_log_table_apply_update(que_thr_t*, ulint, const mrec_t*, const rec_offs*, mem_heap_t*, mem_heap_t*, row_merge_dup_t*, const dtuple_t*)") at assert.c:101
            #8  0x000055c76501de78 in row_log_table_apply_update (thr=0x7fa07401d658, new_trx_id_col=1, mrec=0x7fa0a8080003 "", offsets=0x7fa07400ca38, offsets_heap=0x7fa074078608, heap=0x7fa074074568, dup=0x7fa0a8c9f490, old_pk=0x7fa0740745e8) at /data/src/10.5/storage/innobase/row/row0log.cc:2127
            #9  0x000055c76501fff3 in row_log_table_apply_op (thr=0x7fa07401d658, new_trx_id_col=1, dup=0x7fa0a8c9f490, error=0x7fa0a8c9f20c, offsets_heap=0x7fa074078608, heap=0x7fa074074568, mrec=0x7fa0a8080003 "", mrec_end=0x7fa0a8080094 "", offsets=0x7fa07400ca38) at /data/src/10.5/storage/innobase/row/row0log.cc:2670
            #10 0x000055c765022233 in row_log_table_apply_ops (thr=0x7fa07401d658, dup=0x7fa0a8c9f490, stage=0x7fa074020818) at /data/src/10.5/storage/innobase/row/row0log.cc:3043
            #11 0x000055c765022988 in row_log_table_apply (thr=0x7fa07401d658, old_table=0x7fa07021b7d8, table=0x7fa0a8ca0910, stage=0x7fa074020818, new_table=0x7fa07401df18) at /data/src/10.5/storage/innobase/row/row0log.cc:3149
            #12 0x000055c764ea60c6 in alter_rebuild_apply_log (ctx=0x7fa0740137f0, ha_alter_info=0x7fa0a8ca0870, altered_table=0x7fa0a8ca0910) at /data/src/10.5/storage/innobase/handler/handler0alter.cc:10633
            #13 0x000055c764ea6fc1 in ha_innobase::commit_inplace_alter_table (this=0x7fa0700faa00, altered_table=0x7fa0a8ca0910, ha_alter_info=0x7fa0a8ca0870, commit=true) at /data/src/10.5/storage/innobase/handler/handler0alter.cc:10832
            #14 0x000055c7649fadf9 in handler::ha_commit_inplace_alter_table (this=0x7fa0700faa00, altered_table=0x7fa0a8ca0910, ha_alter_info=0x7fa0a8ca0870, commit=true) at /data/src/10.5/sql/handler.cc:4843
            #15 0x000055c7647918ae in mysql_inplace_alter_table (thd=0x7fa074000b18, table_list=0x7fa074012548, table=0x7fa07021b248, altered_table=0x7fa0a8ca0910, ha_alter_info=0x7fa0a8ca0870, inplace_supported=HA_ALTER_INPLACE_COPY_NO_LOCK, target_mdl_request=0x7fa0a8ca0ce0, alter_ctx=0x7fa0a8ca1830) at /data/src/10.5/sql/sql_table.cc:8038
            #16 0x000055c7647990d9 in mysql_alter_table (thd=0x7fa074000b18, new_db=0x7fa0740053d8, new_name=0x7fa0740057e0, create_info=0x7fa0a8ca2430, table_list=0x7fa074012548, alter_info=0x7fa0a8ca2360, order_num=0, order=0x0, ignore=false, if_exists=false) at /data/src/10.5/sql/sql_table.cc:10564
            #17 0x000055c76483e97a in Sql_cmd_alter_table::execute (this=0x7fa074012c20, thd=0x7fa074000b18) at /data/src/10.5/sql/sql_alter.cc:538
            #18 0x000055c76469da0c in mysql_execute_command (thd=0x7fa074000b18) at /data/src/10.5/sql/sql_parse.cc:5951
            #19 0x000055c7646a3d4d in mysql_parse (thd=0x7fa074000b18, rawbuf=0x7fa074012470 "ALTER TABLE t1 FORCE", length=20, parser_state=0x7fa0a8ca3520, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:7993
            #20 0x000055c764690046 in dispatch_command (command=COM_QUERY, thd=0x7fa074000b18, packet=0x7fa0740088b9 "ALTER TABLE t1 FORCE", packet_length=20, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:1867
            #21 0x000055c76468e77e in do_command (thd=0x7fa074000b18) at /data/src/10.5/sql/sql_parse.cc:1348
            #22 0x000055c76483407a in do_handle_one_connection (connect=0x55c7684c5278, put_in_cache=true) at /data/src/10.5/sql/sql_connect.cc:1410
            #23 0x000055c764833de2 in handle_one_connection (arg=0x55c7684c5278) at /data/src/10.5/sql/sql_connect.cc:1312
            #24 0x000055c764d72126 in pfs_spawn_thread (arg=0x55c7684c4ea8) at /data/src/10.5/storage/perfschema/pfs.cc:2201
            #25 0x00007fa0b16f24a4 in start_thread (arg=0x7fa0a8ca4700) at pthread_create.c:456
            #26 0x00007fa0af826d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
            

            elenst Elena Stepanova added a comment - And here is a test case which seems to be failing equally well on all of current 10.1-10.5 (but it's still non-deterministic, so run with --repeat=N if it doesn't fail right away): --source include/have_innodb.inc   CREATE TABLE t1 (a INT ) ENGINE=InnoDB; XA BEGIN 'x' ; INSERT IGNORE INTO t1 VALUES (1),(2);   --connect (con1,localhost,root,,test) SET lock_wait_timeout= 1; --send ALTER TABLE t1 FORCE ;   --connection default --error ER_NO_SUCH_TABLE UPDATE IGNORE xx SET x = 0; --error ER_XAER_RMFAIL FLUSH TABLES; UPDATE t1 SET a = 3; XA END 'x' ; XA ROLLBACK 'x' ;   --connection con1 --reap   # Cleanup --disconnect con1 --connection default DROP TABLE t1; 10.5 74491955 mariadbd: /data/src/10.5/storage/innobase/row/row0log.cc:2127: dberr_t row_log_table_apply_update(que_thr_t*, ulint, const mrec_t*, const rec_offs*, mem_heap_t*, mem_heap_t*, row_merge_dup_t*, const dtuple_t*): Assertion `log->blobs' failed. 200725 16:08:18 [ERROR] mysqld got signal 6 ;   #7 0x00007fa0af769f12 in __GI___assert_fail (assertion=0x55c7657c992a "log->blobs", file=0x55c7657c8cf8 "/data/src/10.5/storage/innobase/row/row0log.cc", line=2127, function=0x55c7657ccca0 <row_log_table_apply_update(que_thr_t*, unsigned long, unsigned char const*, unsigned short const*, mem_block_info_t*, mem_block_info_t*, row_merge_dup_t*, dtuple_t const*)::__PRETTY_FUNCTION__> "dberr_t row_log_table_apply_update(que_thr_t*, ulint, const mrec_t*, const rec_offs*, mem_heap_t*, mem_heap_t*, row_merge_dup_t*, const dtuple_t*)") at assert.c:101 #8 0x000055c76501de78 in row_log_table_apply_update (thr=0x7fa07401d658, new_trx_id_col=1, mrec=0x7fa0a8080003 "", offsets=0x7fa07400ca38, offsets_heap=0x7fa074078608, heap=0x7fa074074568, dup=0x7fa0a8c9f490, old_pk=0x7fa0740745e8) at /data/src/10.5/storage/innobase/row/row0log.cc:2127 #9 0x000055c76501fff3 in row_log_table_apply_op (thr=0x7fa07401d658, new_trx_id_col=1, dup=0x7fa0a8c9f490, error=0x7fa0a8c9f20c, offsets_heap=0x7fa074078608, heap=0x7fa074074568, mrec=0x7fa0a8080003 "", mrec_end=0x7fa0a8080094 "", offsets=0x7fa07400ca38) at /data/src/10.5/storage/innobase/row/row0log.cc:2670 #10 0x000055c765022233 in row_log_table_apply_ops (thr=0x7fa07401d658, dup=0x7fa0a8c9f490, stage=0x7fa074020818) at /data/src/10.5/storage/innobase/row/row0log.cc:3043 #11 0x000055c765022988 in row_log_table_apply (thr=0x7fa07401d658, old_table=0x7fa07021b7d8, table=0x7fa0a8ca0910, stage=0x7fa074020818, new_table=0x7fa07401df18) at /data/src/10.5/storage/innobase/row/row0log.cc:3149 #12 0x000055c764ea60c6 in alter_rebuild_apply_log (ctx=0x7fa0740137f0, ha_alter_info=0x7fa0a8ca0870, altered_table=0x7fa0a8ca0910) at /data/src/10.5/storage/innobase/handler/handler0alter.cc:10633 #13 0x000055c764ea6fc1 in ha_innobase::commit_inplace_alter_table (this=0x7fa0700faa00, altered_table=0x7fa0a8ca0910, ha_alter_info=0x7fa0a8ca0870, commit=true) at /data/src/10.5/storage/innobase/handler/handler0alter.cc:10832 #14 0x000055c7649fadf9 in handler::ha_commit_inplace_alter_table (this=0x7fa0700faa00, altered_table=0x7fa0a8ca0910, ha_alter_info=0x7fa0a8ca0870, commit=true) at /data/src/10.5/sql/handler.cc:4843 #15 0x000055c7647918ae in mysql_inplace_alter_table (thd=0x7fa074000b18, table_list=0x7fa074012548, table=0x7fa07021b248, altered_table=0x7fa0a8ca0910, ha_alter_info=0x7fa0a8ca0870, inplace_supported=HA_ALTER_INPLACE_COPY_NO_LOCK, target_mdl_request=0x7fa0a8ca0ce0, alter_ctx=0x7fa0a8ca1830) at /data/src/10.5/sql/sql_table.cc:8038 #16 0x000055c7647990d9 in mysql_alter_table (thd=0x7fa074000b18, new_db=0x7fa0740053d8, new_name=0x7fa0740057e0, create_info=0x7fa0a8ca2430, table_list=0x7fa074012548, alter_info=0x7fa0a8ca2360, order_num=0, order=0x0, ignore=false, if_exists=false) at /data/src/10.5/sql/sql_table.cc:10564 #17 0x000055c76483e97a in Sql_cmd_alter_table::execute (this=0x7fa074012c20, thd=0x7fa074000b18) at /data/src/10.5/sql/sql_alter.cc:538 #18 0x000055c76469da0c in mysql_execute_command (thd=0x7fa074000b18) at /data/src/10.5/sql/sql_parse.cc:5951 #19 0x000055c7646a3d4d in mysql_parse (thd=0x7fa074000b18, rawbuf=0x7fa074012470 "ALTER TABLE t1 FORCE", length=20, parser_state=0x7fa0a8ca3520, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:7993 #20 0x000055c764690046 in dispatch_command (command=COM_QUERY, thd=0x7fa074000b18, packet=0x7fa0740088b9 "ALTER TABLE t1 FORCE", packet_length=20, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:1867 #21 0x000055c76468e77e in do_command (thd=0x7fa074000b18) at /data/src/10.5/sql/sql_parse.cc:1348 #22 0x000055c76483407a in do_handle_one_connection (connect=0x55c7684c5278, put_in_cache=true) at /data/src/10.5/sql/sql_connect.cc:1410 #23 0x000055c764833de2 in handle_one_connection (arg=0x55c7684c5278) at /data/src/10.5/sql/sql_connect.cc:1312 #24 0x000055c764d72126 in pfs_spawn_thread (arg=0x55c7684c4ea8) at /data/src/10.5/storage/perfschema/pfs.cc:2201 #25 0x00007fa0b16f24a4 in start_thread (arg=0x7fa0a8ca4700) at pthread_create.c:456 #26 0x00007fa0af826d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
            elenst Elena Stepanova made changes -
            Comment [ A comment with security level 'Developers' was removed. ]
            elenst Elena Stepanova made changes -
            Labels need_rr rr-profile-analyzed rr-profile rr-profile-analyzed

            For the 10.5 failure that involves ALTER TABLE t1 FORCE, I believe that the issue is that we are wrongly allowing LOCK=NONE when rebuilding the table that is lacking a PRIMARY KEY. At the time of the assertion failure, the table is empty. I did not see any wrongdoing on the SQL layer. No unexpected DML transaction rollback or incorrect release of MDL.

            The 10.1 failure is a little different, because the table is not empty. But, nevertheless, also for the ALTER TABLE t1 DROP b we should refuse any concurrent DML, because the table only defines the hidden primary key GEN_CLUST_INDEX(DB_ROW_ID).

            elenst, did you ever repeat this with a table that defines a PRIMARY KEY or a UNIQUE INDEX solely on NOT NULL columns?

            marko Marko Mäkelä added a comment - For the 10.5 failure that involves ALTER TABLE t1 FORCE , I believe that the issue is that we are wrongly allowing LOCK=NONE when rebuilding the table that is lacking a PRIMARY KEY . At the time of the assertion failure, the table is empty. I did not see any wrongdoing on the SQL layer. No unexpected DML transaction rollback or incorrect release of MDL. The 10.1 failure is a little different, because the table is not empty. But, nevertheless, also for the ALTER TABLE t1 DROP b we should refuse any concurrent DML, because the table only defines the hidden primary key GEN_CLUST_INDEX(DB_ROW_ID) . elenst , did you ever repeat this with a table that defines a PRIMARY KEY or a UNIQUE INDEX solely on NOT NULL columns?
            marko Marko Mäkelä made changes -
            Labels rr-profile rr-profile-analyzed rr-profile-analyzed
            elenst Elena Stepanova added a comment - - edited

            marko,
            It is reproducible the same way (same assertion in the same line of 10.5) with this test case:

            --source include/have_innodb.inc
             
            CREATE TABLE t1 (pk INT NOT NULL PRIMARY KEY, a INT) ENGINE=InnoDB;
            XA BEGIN 'x';
            INSERT INTO t1 VALUES (1,1),(2,2);
             
            --connect (con1,localhost,root,,test)
            SET lock_wait_timeout= 1;
            --send
              ALTER TABLE t1 FORCE;
             
            --connection default
            --error ER_NO_SUCH_TABLE
            UPDATE IGNORE xx SET x = 0;
            --error ER_XAER_RMFAIL
            FLUSH TABLES;
            UPDATE t1 SET a = 3;
            XA END 'x';
            XA ROLLBACK 'x';
             
            --connection con1
            --reap
             
            # Cleanup
            --disconnect con1
            --connection default
            DROP TABLE t1;
            

            Fails on 10.1 too.

            elenst Elena Stepanova added a comment - - edited marko , It is reproducible the same way (same assertion in the same line of 10.5) with this test case: --source include/have_innodb.inc CREATE TABLE t1 (pk INT NOT NULL PRIMARY KEY , a INT ) ENGINE=InnoDB; XA BEGIN 'x' ; INSERT INTO t1 VALUES (1,1),(2,2); --connect (con1,localhost,root,,test) SET lock_wait_timeout= 1; --send ALTER TABLE t1 FORCE ; --connection default --error ER_NO_SUCH_TABLE UPDATE IGNORE xx SET x = 0; --error ER_XAER_RMFAIL FLUSH TABLES; UPDATE t1 SET a = 3; XA END 'x' ; XA ROLLBACK 'x' ; --connection con1 --reap # Cleanup --disconnect con1 --connection default DROP TABLE t1; Fails on 10.1 too.
            marko Marko Mäkelä added a comment - - edited

            I filed MDEV-23310 for the bug that I think should be repeatable without involving XA transactions. But, in the end, I determined that there is no problem to allow an online table rebuild when no PRIMARY KEY exists. The operation is like any table rebuild where the PRIMARY KEY is being preserved.

            That being said, it is more convenient to debug cases where an explicit PRIMARY KEY exists.

            marko Marko Mäkelä added a comment - - edited I filed MDEV-23310 for the bug that I think should be repeatable without involving XA transactions. But, in the end, I determined that there is no problem to allow an online table rebuild when no PRIMARY KEY exists. The operation is like any table rebuild where the PRIMARY KEY is being preserved. That being said, it is more convenient to debug cases where an explicit PRIMARY KEY exists.
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]

            I can repeat the crash on 10.2 with the latest test case. It failed on the first try both with and without the -rr option. The log>tail.block contains the following records:
            ROW_T_UPDATE(pk=1,DB_TRX_ID=0x504,DB_ROLL_PTR=(update),a=3)
            ROW_T_UPDATE(pk=2,DB_TRX_ID=0x504,DB_ROLL_PTR=(update),a=3)
            ROW_T_UPDATE(pk=2,DB_TRX_ID=0x504,DB_ROLL_PTR=(insert2),a=2)
            ROW_T_UPDATE(pk=1,DB_TRX_ID=0x504,DB_ROLL_PTR=(insert1),a=1)
            ROW_T_DELETE(pk=2,DB_TRX_ID=0x504,DB_ROLL_PTR=(insert2))
            ROW_T_DELETE(pk=1,DB_TRX_ID=0x504,DB_ROLL_PTR=(insert1))

            In this log, everything would be fine if also the initial two INSERT rows of the XA BEGIN 'x' transaction were present. In that case, the online ALTER would have copied the empty table and then applied the log for the concurrent rolled-back transaction, ending up with an empty table again.

            Clearly, something is wrongly allowing the ALTER TABLE t1 FORCE to start while that transaction is active and accessing the table. As expected, the MDL was incorrectly released by a statement that returned an error. Maybe the erroneous UPDATE is needed as well, but the FLUSH TABLES is what will violate the design constraints:

            10.2 3c3f172f17de217389bc51053c555a5d7e91ddff

            Thread 29 hit Breakpoint 1, MDL_context::release_transactional_locks (
                this=0x7f7900000e48) at /mariadb/10.2o/sql/mdl.cc:2843
            2843	{
            (rr) bt
            #0  MDL_context::release_transactional_locks (this=0x7f7900000e48)
                at /mariadb/10.2o/sql/mdl.cc:2843
            #1  0x000055f2bce3ce1d in mysql_execute_command (thd=0x7f7900000d28)
                at /mariadb/10.2o/sql/sql_parse.cc:6080
            #2  0x000055f2bce33564 in mysql_parse (thd=0x7f7900000d28, 
                rawbuf=0x7f7900011970 "FLUSH TABLES", length=<optimized out>, 
                parser_state=0x7f796405d6d0, is_com_multi=<optimized out>, 
                is_next_command=false) at /mariadb/10.2o/sql/sql_parse.cc:7733
            …
            (rr) break ha_innobase::prepare_inplace_alter_table
            Breakpoint 2 at 0x55f2bd1e8570: file /mariadb/10.2o/storage/innobase/handler/handler0alter.cc, line 5326.
            (rr) when
            Current event: 118228
            (rr) c
            Continuing.
            [Switching to Thread 539985.540023]
             
            Thread 3 hit Breakpoint 2, ha_innobase::prepare_inplace_alter_table (
            

            The MDL is being released in a different piece of the code than in MDEV-15532:

              else if (stmt_causes_implicit_commit(thd, CF_IMPLICIT_COMMIT_END))
              {
                /* No transaction control allowed in sub-statements. */
                DBUG_ASSERT(! thd->in_sub_stmt);
                if (!(thd->variables.option_bits & OPTION_GTID_BEGIN))
                {
                  /* If commit fails, we should be able to reset the OK status. */
                  thd->get_stmt_da()->set_overwrite_status(true);
                  /* Commit the normal transaction if one is active. */
                  trans_commit_implicit(thd);
                  thd->get_stmt_da()->set_overwrite_status(false);
                  thd->mdl_context.release_transactional_locks();
                }
              }
            

            marko Marko Mäkelä added a comment - I can repeat the crash on 10.2 with the latest test case. It failed on the first try both with and without the - rr option. The log >tail.block contains the following records: ROW_T_UPDATE(pk=1,DB_TRX_ID=0x504,DB_ROLL_PTR=(update),a=3) ROW_T_UPDATE(pk=2,DB_TRX_ID=0x504,DB_ROLL_PTR=(update),a=3) ROW_T_UPDATE(pk=2,DB_TRX_ID=0x504,DB_ROLL_PTR=(insert2),a=2) ROW_T_UPDATE(pk=1,DB_TRX_ID=0x504,DB_ROLL_PTR=(insert1),a=1) ROW_T_DELETE(pk=2,DB_TRX_ID=0x504,DB_ROLL_PTR=(insert2)) ROW_T_DELETE(pk=1,DB_TRX_ID=0x504,DB_ROLL_PTR=(insert1)) In this log, everything would be fine if also the initial two INSERT rows of the XA BEGIN 'x' transaction were present. In that case, the online ALTER would have copied the empty table and then applied the log for the concurrent rolled-back transaction, ending up with an empty table again. Clearly, something is wrongly allowing the ALTER TABLE t1 FORCE to start while that transaction is active and accessing the table. As expected, the MDL was incorrectly released by a statement that returned an error. Maybe the erroneous UPDATE is needed as well, but the FLUSH TABLES is what will violate the design constraints: 10.2 3c3f172f17de217389bc51053c555a5d7e91ddff Thread 29 hit Breakpoint 1, MDL_context::release_transactional_locks ( this=0x7f7900000e48) at /mariadb/10.2o/sql/mdl.cc:2843 2843 { (rr) bt #0 MDL_context::release_transactional_locks (this=0x7f7900000e48) at /mariadb/10.2o/sql/mdl.cc:2843 #1 0x000055f2bce3ce1d in mysql_execute_command (thd=0x7f7900000d28) at /mariadb/10.2o/sql/sql_parse.cc:6080 #2 0x000055f2bce33564 in mysql_parse (thd=0x7f7900000d28, rawbuf=0x7f7900011970 "FLUSH TABLES", length=<optimized out>, parser_state=0x7f796405d6d0, is_com_multi=<optimized out>, is_next_command=false) at /mariadb/10.2o/sql/sql_parse.cc:7733 … (rr) break ha_innobase::prepare_inplace_alter_table Breakpoint 2 at 0x55f2bd1e8570: file /mariadb/10.2o/storage/innobase/handler/handler0alter.cc, line 5326. (rr) when Current event: 118228 (rr) c Continuing. [Switching to Thread 539985.540023]   Thread 3 hit Breakpoint 2, ha_innobase::prepare_inplace_alter_table ( The MDL is being released in a different piece of the code than in MDEV-15532 : else if (stmt_causes_implicit_commit(thd, CF_IMPLICIT_COMMIT_END)) { /* No transaction control allowed in sub-statements. */ DBUG_ASSERT(! thd->in_sub_stmt); if (!(thd->variables.option_bits & OPTION_GTID_BEGIN)) { /* If commit fails, we should be able to reset the OK status. */ thd->get_stmt_da()->set_overwrite_status( true ); /* Commit the normal transaction if one is active. */ trans_commit_implicit(thd); thd->get_stmt_da()->set_overwrite_status( false ); thd->mdl_context.release_transactional_locks(); } }
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Sergei Golubchik [ serg ]
            marko Marko Mäkelä made changes -
            Assignee Sergei Golubchik [ serg ] Vladislav Vaintroub [ wlad ]
            marko Marko Mäkelä made changes -
            julien.fritsch Julien Fritsch made changes -
            Priority Major [ 3 ] Critical [ 2 ]

            Not able to reproduce the mtr example crash, seems to be fixed with MDEV-14693

            wlad Vladislav Vaintroub added a comment - Not able to reproduce the mtr example crash, seems to be fixed with MDEV-14693
            wlad Vladislav Vaintroub made changes -
            issue.field.resolutiondate 2020-12-04 09:32:55.0 2020-12-04 09:32:55.297
            wlad Vladislav Vaintroub made changes -
            Fix Version/s 10.2.37 [ 25112 ]
            Fix Version/s 10.3.28 [ 25111 ]
            Fix Version/s 10.4.18 [ 25110 ]
            Fix Version/s 10.5.9 [ 25109 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.3 [ 22126 ]
            Fix Version/s 10.4 [ 22408 ]
            Fix Version/s 10.5 [ 23123 ]
            Resolution Fixed [ 1 ]
            Status Confirmed [ 10101 ] Closed [ 6 ]

            It was actually fixed with MDEV-15532.
            It is still easily enough reproducible on a previous commit, but not since this one:

            commit 828471cbf83774f4537a78551290b7a4a7f5d374
            Author: Monty
            Date:   Mon Nov 30 15:29:32 2020 +0200
             
                MDEV 15532 Assertion `!log->same_pk' failed in row_log_table_apply_delete
            

            elenst Elena Stepanova added a comment - It was actually fixed with MDEV-15532 . It is still easily enough reproducible on a previous commit, but not since this one: commit 828471cbf83774f4537a78551290b7a4a7f5d374 Author: Monty Date: Mon Nov 30 15:29:32 2020 +0200   MDEV 15532 Assertion `!log->same_pk' failed in row_log_table_apply_delete

            yes, a small mixup.

            wlad Vladislav Vaintroub added a comment - yes, a small mixup.
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 85956 ] MariaDB v4 [ 153934 ]

            People

              wlad Vladislav Vaintroub
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.