Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32100

Online ALTER TABLE ends with 1032 (Can't find record) under some isolation levels

Details

    Description

      origin/11.3, 11.3 8ad1e26b1bafa4ed9928306efc10c047f2274108 2023-09-04T09:36:44+04:00
      origin/bb-11.3-MDEV-28699 d2947464c841d0a989098010e04bfda7dcf7df58 2023-09-04T21:41:01+05:30 (rr traces)
      Per Marko and Thiru:
      The problems are most probably caused by MDEV-16329.
       
      Scenario:
      1. Start the server and generate some initial data.
      2. Several session run concurrent some DDL/DML mix
          Sometimes this kind of test fails with
           ALTER TABLE t3 MODIFY COLUMN col2 INT NOT NULL, LOCK = DEFAULT  /* E_R Thread2 QNO 2894 CON_ID 15 */  failed: 1032 , errstr: Can't find record in 't3'
           pluto:/data/results/1693844534/TBR-403--TBR-403D_1$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio
          or
          ALTER TABLE t1 DROP PRIMARY KEY, ALGORITHM = DEFAULT, LOCK = DEFAULT  /* E_R Thread16 QNO 59 CON_ID 28 */  failed: 1032 , errstr: Can't find record in 't1'
          pluto:/data/results/1693844534/TBR-403--TBR-403F--TBR-403D_better$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio
       
      One of the various test setups which showed the trouble
      # git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
      #
      # GIT_SHOW: HEAD -> master, origin/master, origin/HEAD d5889a108c00974367f47657cabe4804fe7df566 2023-09-04T13:53:19+02:00
      # rqg.pl  : Version 4.4.0 (2023-08)
      #
      # $RQG_HOME/rqg.pl \
      # --grammar=conf/mariadb/table_stress_innodb.yy \
      # --gendata=conf/mariadb/table_stress.zz \
      # --gendata_sql=conf/mariadb/table_stress.sql \
      # --mysqld=--transaction-isolation=SERIALIZABLE \
      # --validator=SelectStability \
      # --redefine=conf/mariadb/redefine_innodb_truncate_temporary_tablespace_flipper.yy \    <==  not required
      # --redefine=conf/mariadb/redefine_temporary_tables.yy \                                                               <==  not required
      # --rounds=2 \                                                                                                                                                      <==  not required
      # --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
      # --mysqld=--loose-idle_write_transaction_timeout=0 \
      # --mysqld=--loose-idle_transaction_timeout=0 \
      # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      # --mysqld=--connect_timeout=60 \
      # --mysqld=--interactive_timeout=28800 \
      # --mysqld=--slave_net_timeout=60 \
      # --mysqld=--net_read_timeout=30 \
      # --mysqld=--net_write_timeout=60 \
      # --mysqld=--loose-table_lock_wait_timeout=50 \
      # --mysqld=--wait_timeout=28800 \
      # --no_mask \
      # --queries=10000000 \
      # --seed=random \
      # --reporters=None \
      # --reporters=ErrorLog \
      # --reporters=Deadlock \
      # --validators=None \
      # --mysqld=--log_output=none \
      # --mysqld=--log_bin_trust_function_creators=1 \
      # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      # --engine=InnoDB \
      # --restart_timeout=240 \
      # --mysqld=--plugin-load-add=file_key_management.so \
      # --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      # --mysqld=--plugin-load-add=provider_lzo.so \
      # --mysqld=--plugin-load-add=provider_bzip2.so \
      # --mysqld=--plugin-load-add=provider_lzma.so \
      # --mysqld=--plugin-load-add=provider_snappy.so \
      # --mysqld=--plugin-load-add=provider_lz4.so \
      # --mysqld=--loose-innodb_compression_level=1 \
      # --duration=300 \
      # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      # --mysqld=--lock-wait-timeout=86400 \
      # --mysqld=--innodb-lock-wait-timeout=50 \
      # --mysqld=--sql_mode=traditional \
      # --mysqld=--innodb_file_per_table=1 \
      # --mysqld=--loose-innodb_read_only_compressed=OFF \
      # --mysqld=--innodb_stats_persistent=on \
      # --mysqld=--innodb_adaptive_hash_index=off \
      # --mysqld=--innodb_random_read_ahead=ON \
      # --mysqld=--log-bin \
      # --mysqld=--sync-binlog=1 \
      # --mysqld=--loose-innodb_evict_tables_on_commit_debug=off \
      # --mysqld=--loose-max-statement-time=30 \
      # --threads=2 \
      # --mysqld=--innodb-use-native-aio=0 \
      # --mysqld=--loose-gdb \
      # --mysqld=--loose-debug-gdb \
      # --rr=Extended \
      # --rr_options=--wait \
      # --mysqld=--innodb_undo_log_truncate=OFF \
      # --mysqld=--loose_innodb_change_buffering=deletes \
      # --mysqld=--innodb_undo_tablespaces=16 \
      # --mysqld=--innodb_rollback_on_timeout=ON \
      # --mysqld=--innodb_page_size=8K \
      # --mysqld=--innodb-buffer-pool-size=256M \
      # <local settings>
          
      

      Attachments

        Activity

          Here goes the deterministic test:

          --source include/have_innodb.inc
           
          create table t1 (a int, b int, key(b)) engine=innodb;
           
          --connect (con1,localhost,root,,test)
          insert into t1 values (1,1),(null,null),(3,3),(4,null),(null,5);
           
          --connection default
          set session transaction isolation level read committed;
          set debug_sync= "alter_table_online_downgraded signal downgraded wait_for goalters";
           
          --send
          alter table t1 force, algorithm=copy;
           
          --connection con1
          set debug_sync= "now wait_for downgraded";
          delete from t1 where b is null;
          set debug_sync= "now signal goalters";
           
          --connection default
          --reap
           
          drop table t1;
          

          Few notes:
          1. READ COMMITTED works as good as SERIALIZEABLE
          2. One may find that the first DELETE, searching by a (WHERE a IS NULL) is missing in the test.
          It is not possible to reproduce the test with the subj, as it just serializes with ALTER. Probably, the reason is that it makes searches by the clustered index instead of secondary key. It's surprising, though, that searching by key doesn't lock that much. Maybe marko can know why.

          No #3, pardon

          nikitamalyavin Nikita Malyavin added a comment - Here goes the deterministic test: --source include/have_innodb.inc   create table t1 (a int , b int , key (b)) engine=innodb;   --connect (con1,localhost,root,,test) insert into t1 values (1,1),( null , null ),(3,3),(4, null ),( null ,5);   --connection default set session transaction isolation level read committed ; set debug_sync= "alter_table_online_downgraded signal downgraded wait_for goalters" ;   --send alter table t1 force , algorithm=copy;   --connection con1 set debug_sync= "now wait_for downgraded" ; delete from t1 where b is null ; set debug_sync= "now signal goalters" ;   --connection default --reap   drop table t1; Few notes: 1. READ COMMITTED works as good as SERIALIZEABLE 2. One may find that the first DELETE , searching by a ( WHERE a IS NULL ) is missing in the test. It is not possible to reproduce the test with the subj, as it just serializes with ALTER. Probably, the reason is that it makes searches by the clustered index instead of secondary key. It's surprising, though, that searching by key doesn't lock that much. Maybe marko can know why. No #3, pardon

          1. READ COMMITTED works as good as SERIALIZEABLE

          And READ UNCOMMITTED too (for completeness.).

          elenst Elena Stepanova added a comment - 1. READ COMMITTED works as good as SERIALIZEABLE And READ UNCOMMITTED too (for completeness.).

          So far, the INPLACE innodb's trx is initialized during the INPLACE ALTER TABLE with the default isolation_level, which is TRX_ISO_REPEATABLE_READ,
          whereas COPY ALTER TABLE uses trx assigned to current thd.

          (gdb) bt
          #0  0x000055a77f5f567a in ha_innobase_inplace_ctx::ha_innobase_inplace_ctx (this=0x6290000fc1a8, prebuilt_arg=@0x61d000284e90: 0x620000020120, drop_arg=0x0, num_to_drop_arg=0, drop_fk_arg=0x0, num_to_drop_fk_arg=0, add_fk_arg=0x0, num_to_add_fk_arg=0, online_arg=true, heap_arg=0x61a000025898, new_table_arg=0x61800003b120, col_names_arg=0x0, add_autoinc_arg=18446744073709551615, autoinc_col_min_value_arg=0, autoinc_col_max_value_arg=0, allow_not_null_flag=false, page_compressed=false, page_compression_level_arg=0) at /home/nik/mariadb/storage/innobase/handler/handler0alter.cc:1118
          #1  0x000055a77f59adbd in ha_innobase::prepare_inplace_alter_table (this=0x61d0002846b8, altered_table=0x7fd66f25d810, ha_alter_info=0x7fd66f25d080) at /home/nik/mariadb/storage/innobase/handler/handler0alter.cc:8621
          #2  0x000055a77d511b7e in handler::ha_prepare_inplace_alter_table (this=0x61d0002846b8, altered_table=0x7fd66f25d810, ha_alter_info=0x7fd66f25d080) at /home/nik/mariadb/sql/handler.cc:5429
          #3  0x000055a77e3b6060 in mysql_inplace_alter_table (thd=0x62b00017a218, table_list=0x6290000fa398, table=0x6190000c1798, altered_table=0x7fd66f25d810, ha_alter_info=0x7fd66f25d080, target_mdl_request=0x7fd66f25b6b0, ddl_log_state=0x7fd66f25b040, trigger_param=0x7fd66f25b8e0, alter_ctx=0x7fd66f25bd70, partial_alter=@0x7fd66f25b0c0: false, start_alter_id=@0x7fd66f25b0d0: 0, if_exists=false) at /home/nik/mariadb/sql/sql_table.cc:7679
          #4  0x000055a77e39f4ff in mysql_alter_table (thd=0x62b00017a218, new_db=0x62b00017efe0, new_name=0x62b00017f430, create_info=0x7fd66f028020, table_list=0x6290000fa398, recreate_info=0x7fd66f028c80, alter_info=0x7fd66f028320, order_num=0, order=0x0, ignore=false, if_exists=false) at /home/nik/mariadb/sql/sql_table.cc:11020
          #5  0x000055a77e64b1d6 in Sql_cmd_alter_table::execute (this=0x6290000fad10, thd=0x62b00017a218) at /home/nik/mariadb/sql/sql_alter.cc:615
          #6  0x000055a77dffd17a in mysql_execute_command (thd=0x62b00017a218, is_called_from_prepared_stmt=false) at /home/nik/mariadb/sql/sql_parse.cc:5775
          #7  0x000055a77dfdc10b in mysql_parse (thd=0x62b00017a218, rawbuf=0x6290000fa238 "alter table t1 add c int, add key(c), algorithm=inplace, lock=none", length=66, parser_state=0x7fd66f1677f0) at /home/nik/mariadb/sql/sql_parse.cc:7811
          #8  0x000055a77dfd5b8e in dispatch_command (command=COM_QUERY, thd=0x62b00017a218, packet=0x629000276219 "alter table t1 add c int, add key(c), algorithm=inplace, lock=none", packet_length=66, blocking=true) at /home/nik/mariadb/sql/sql_parse.cc:1893
          #9  0x000055a77dfddeda in do_command (thd=0x62b00017a218, blocking=true) at /home/nik/mariadb/sql/sql_parse.cc:1406
          #10 0x000055a77e622536 in do_handle_one_connection (connect=0x608000002e38, put_in_cache=true) at /home/nik/mariadb/sql/sql_connect.cc:1445
          #11 0x000055a77e621ccf in handle_one_connection (arg=0x608000002db8) at /home/nik/mariadb/sql/sql_connect.cc:1347
          

          nikitamalyavin Nikita Malyavin added a comment - So far, the INPLACE innodb's trx is initialized during the INPLACE ALTER TABLE with the default isolation_level , which is TRX_ISO_REPEATABLE_READ , whereas COPY ALTER TABLE uses trx assigned to current thd . (gdb) bt #0 0x000055a77f5f567a in ha_innobase_inplace_ctx::ha_innobase_inplace_ctx (this=0x6290000fc1a8, prebuilt_arg=@0x61d000284e90: 0x620000020120, drop_arg=0x0, num_to_drop_arg=0, drop_fk_arg=0x0, num_to_drop_fk_arg=0, add_fk_arg=0x0, num_to_add_fk_arg=0, online_arg=true, heap_arg=0x61a000025898, new_table_arg=0x61800003b120, col_names_arg=0x0, add_autoinc_arg=18446744073709551615, autoinc_col_min_value_arg=0, autoinc_col_max_value_arg=0, allow_not_null_flag=false, page_compressed=false, page_compression_level_arg=0) at /home/nik/mariadb/storage/innobase/handler/handler0alter.cc:1118 #1 0x000055a77f59adbd in ha_innobase::prepare_inplace_alter_table (this=0x61d0002846b8, altered_table=0x7fd66f25d810, ha_alter_info=0x7fd66f25d080) at /home/nik/mariadb/storage/innobase/handler/handler0alter.cc:8621 #2 0x000055a77d511b7e in handler::ha_prepare_inplace_alter_table (this=0x61d0002846b8, altered_table=0x7fd66f25d810, ha_alter_info=0x7fd66f25d080) at /home/nik/mariadb/sql/handler.cc:5429 #3 0x000055a77e3b6060 in mysql_inplace_alter_table (thd=0x62b00017a218, table_list=0x6290000fa398, table=0x6190000c1798, altered_table=0x7fd66f25d810, ha_alter_info=0x7fd66f25d080, target_mdl_request=0x7fd66f25b6b0, ddl_log_state=0x7fd66f25b040, trigger_param=0x7fd66f25b8e0, alter_ctx=0x7fd66f25bd70, partial_alter=@0x7fd66f25b0c0: false, start_alter_id=@0x7fd66f25b0d0: 0, if_exists=false) at /home/nik/mariadb/sql/sql_table.cc:7679 #4 0x000055a77e39f4ff in mysql_alter_table (thd=0x62b00017a218, new_db=0x62b00017efe0, new_name=0x62b00017f430, create_info=0x7fd66f028020, table_list=0x6290000fa398, recreate_info=0x7fd66f028c80, alter_info=0x7fd66f028320, order_num=0, order=0x0, ignore=false, if_exists=false) at /home/nik/mariadb/sql/sql_table.cc:11020 #5 0x000055a77e64b1d6 in Sql_cmd_alter_table::execute (this=0x6290000fad10, thd=0x62b00017a218) at /home/nik/mariadb/sql/sql_alter.cc:615 #6 0x000055a77dffd17a in mysql_execute_command (thd=0x62b00017a218, is_called_from_prepared_stmt=false) at /home/nik/mariadb/sql/sql_parse.cc:5775 #7 0x000055a77dfdc10b in mysql_parse (thd=0x62b00017a218, rawbuf=0x6290000fa238 "alter table t1 add c int, add key(c), algorithm=inplace, lock=none", length=66, parser_state=0x7fd66f1677f0) at /home/nik/mariadb/sql/sql_parse.cc:7811 #8 0x000055a77dfd5b8e in dispatch_command (command=COM_QUERY, thd=0x62b00017a218, packet=0x629000276219 "alter table t1 add c int, add key(c), algorithm=inplace, lock=none", packet_length=66, blocking=true) at /home/nik/mariadb/sql/sql_parse.cc:1893 #9 0x000055a77dfddeda in do_command (thd=0x62b00017a218, blocking=true) at /home/nik/mariadb/sql/sql_parse.cc:1406 #10 0x000055a77e622536 in do_handle_one_connection (connect=0x608000002e38, put_in_cache=true) at /home/nik/mariadb/sql/sql_connect.cc:1445 #11 0x000055a77e621ccf in handle_one_connection (arg=0x608000002db8) at /home/nik/mariadb/sql/sql_connect.cc:1347

          Please review branch bb-11.2-nikita

          nikitamalyavin Nikita Malyavin added a comment - Please review branch bb-11.2-nikita

          fb158d2fece is ok to push

          serg Sergei Golubchik added a comment - fb158d2fece is ok to push

          People

            nikitamalyavin Nikita Malyavin
            mleich Matthias Leich
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.