Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31347

fil_ibd_create() may hijack the file handle of an old file

Details

    Description

      origin/10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf 2023-05-23T12:25:39+03:00 + some patch (pluto:/Server/purge.patch).
      The same problem (maybe different backtrace) was replayed on
       origin/10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf 2023-05-23T12:25:39+03:00
      origin/bb-10.6-MDEV-26782 1ed645614b8895857f28839807f2d42c9220b936 2023-04-21T14:14:30+03:00
      origin/bb-10.6-MDEV-26782 3d589c462b7e17145e055ce9066bbf27856ca215 2023-04-20T09:14:17+03:00
      origin/bb-10.6-MDEV-30996 5f7dfabdaa658ef77c5b511e25134b60e9a129c0 2023-04-19T15:13:39+05:30
      origin/bb-10.6-MDEV-31234 ece84a7218c153d2c0fa5c2d33605304bc441fb9 2023-05-22T13:17:49+03:00
      origin/bb-10.9-MDEV-29911 e9a6dc7ae59dec89bdbb1d60bcbdc76bc3d583c8 2023-05-11T14:29:56+03:00
      origin/10.6 2ddfb838078cb0f81b48ef878d97b966c8d9c494 2023-04-12T19:07:59+05:30
       
      Scenario:
      1. Start the server and generate some initial data.
      2. Two sessions run concurrent
                    ALTER TABLE <some table created+filled in 1.> FORCE ;
       
      The server error log ends with
      # 2023-05-25T15:48:34 [1227458] | [rr 1231496 1161866]2023-05-25 15:17:04 14 [Note] InnoDB: Cannot close file ./test/#sql-alter-12ca88-e.ibd because of 1 pending operations
      # 2023-05-25T15:48:34 [1227458] | [rr 1231496 1161936]2023-05-25 15:17:04 16 [ERROR] InnoDB: Operating system error number 9 in a file operation.
      # 2023-05-25T15:48:34 [1227458] | [rr 1231496 1161940]2023-05-25 15:17:04 16 [ERROR] InnoDB: Error number 9 means 'Bad file descriptor'
      # 2023-05-25T15:48:34 [1227458] | [rr 1231496 1161944]2023-05-25 15:17:04 16 [Note] InnoDB: Some operating system error numbers are described at https://mariadb.com/kb/en/library/operating-system-error-codes/
      # 2023-05-25T15:48:34 [1227458] | [rr 1231496 1161948]2023-05-25 15:17:04 16 [ERROR] InnoDB: File (unknown): 'close' returned OS error 209. Cannot continue operation
      # 2023-05-25T15:48:34 [1227458] | [rr 1231496 1161967]230525 15:17:04 [rr 1231496 1161970][ERROR] mysqld got signal 6 ;
       
      Query (0x62b000181238): ALTER TABLE `t7` FORCE /* E_R Thread2 QNO 9 CON_ID 16 */
      Connection ID (thread ID): 16
      Status: KILL_TIMEOUT
       
      (rr) bt
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #1  0x000055d697360859 in __GI_abort () at abort.c:79
      #2  0x000055d694afd497 in os_file_handle_error_cond_exit (name=name@entry=0x0, operation=operation@entry=0x55d695c113c0 "close", should_abort=should_abort@entry=true, on_error_silent=on_error_silent@entry=false)
          at /data/Server/10.6_with_patch/storage/innobase/os/os0file.cc:3033
      #3  0x000055d694afd5da in os_file_handle_error (name=name@entry=0x0, operation=operation@entry=0x55d695c113c0 "close") at /data/Server/10.6_with_patch/storage/innobase/os/os0file.cc:205
      #4  0x000055d694afd60b in os_file_close_func (file=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/os/os0file.cc:1456
      #5  0x000055d6950a285c in fil_node_t::close (this=this@entry=0x6060001c9d60) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:96
      #6  0x000055d6950aac8a in fil_space_t::try_to_close (print_info=print_info@entry=true) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:129
      #7  0x000055d6950b1484 in fil_node_open_file (node=node@entry=0x6060000fea80) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:427
      #8  0x000055d6950b1acd in fil_space_t::prepare_acquired (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101
      #9  0x000055d6950b1e68 in fil_space_t::acquire_and_prepare (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:711
      #10 0x000055d6950b2242 in fil_space_t::acquire (this=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:722
      #11 fil_space_extend (space=space@entry=0x6120000448c0, size=size@entry=7) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:724
      #12 0x000055d6950edda2 in fsp_try_extend_data_file_with_pages (space=space@entry=0x6120000448c0, page_no=page_no@entry=6, header=header@entry=0x13de2229fdc0, mtr=mtr@entry=0x48ab569eed90)
          at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:606
      #13 0x000055d6950ee372 in fsp_reserve_free_pages (space=space@entry=0x6120000448c0, header=header@entry=0x13de2229fdc0, size=size@entry=5, mtr=mtr@entry=0x48ab569eed90, n_pages=n_pages@entry=2)
          at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:2337
      #14 0x000055d6950ef0fd in fsp_reserve_free_extents (n_reserved=n_reserved@entry=0x48ab569eed70, space=0x6120000448c0, n_ext=n_ext@entry=1, alloc_type=alloc_type@entry=FSP_NORMAL, mtr=mtr@entry=0x48ab569eed90, 
          n_pages=n_pages@entry=2) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:2414
      #15 0x000055d694e96953 in PageBulk::init (this=this@entry=0x619001595a80) at /data/Server/10.6_with_patch/storage/innobase/btr/btr0bulk.cc:64
      #16 0x000055d694e9a9e7 in BtrBulk::insert (this=this@entry=0x6040000046d0, tuple=tuple@entry=0x6310004888b8, level=level@entry=0) at /data/Server/10.6_with_patch/storage/innobase/btr/btr0bulk.cc:1023
      #17 0x000055d694c40913 in BtrBulk::insert (tuple=0x6310004888b8, this=0x6040000046d0) at /data/Server/10.6_with_patch/storage/innobase/include/btr0bulk.h:301
      #18 row_merge_insert_index_tuples (index=0x61600b374f08, old_table=old_table@entry=0x6180000c7d08, fd=..., block=block@entry=0x0, row_buf=row_buf@entry=0x615000052188, btr_bulk=<optimized out>, 
          table_total_rows=<optimized out>, pct_progress=pct_progress@entry=17.685342388228637, pct_cost=pct_cost@entry=33.333333333333336, crypt_block=<optimized out>, space=<optimized out>, stage=<optimized out>)
          at /data/Server/10.6_with_patch/storage/innobase/row/row0merge.cc:3667
      #19 0x000055d694c524f7 in row_merge_read_clustered_index (trx=trx@entry=0x267b1ba2c440, table=table@entry=0x48ab569f2370, old_table=old_table@entry=0x6180000c7d08, new_table=new_table@entry=0x6180002d3508, 
          online=online@entry=true, index=index@entry=0x619001598a78, fts_sort_idx=<optimized out>, psort_info=<optimized out>, files=<optimized out>, key_numbers=<optimized out>, n_index=<optimized out>, 
          defaults=<optimized out>, add_v=<optimized out>, col_map=<optimized out>, add_autoinc=<optimized out>, sequence=..., block=<optimized out>, skip_pk_sort=<optimized out>, tmpfd=<optimized out>, stage=<optimized out>, 
          pct_cost=pct_cost@entry=33.333333333333336, crypt_block=<optimized out>, eval_table=<optimized out>, allow_not_null=<optimized out>, col_collate=<optimized out>)
          at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:82
      #20 0x000055d694c55be9 in row_merge_build_indexes (trx=<optimized out>, old_table=<optimized out>, new_table=new_table@entry=0x6180002d3508, online=<optimized out>, indexes=0x619001598a78, key_numbers=<optimized out>, 
          n_indexes=<optimized out>, table=<optimized out>, defaults=<optimized out>, col_map=<optimized out>, add_autoinc=<optimized out>, sequence=..., skip_pk_sort=<optimized out>, stage=<optimized out>, 
          add_v=<optimized out>, eval_table=<optimized out>, allow_not_null=<optimized out>, col_collate=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/row/row0merge.cc:4584
      #21 0x000055d69491de08 in ha_innobase::inplace_alter_table (this=0x61d000535cb8, altered_table=<optimized out>, ha_alter_info=0x48ab569f1c70) at /usr/include/c++/9/bits/stl_tree.h:1038
      #22 0x000055d693895ca4 in handler::ha_inplace_alter_table (ha_alter_info=0x48ab569f1c70, altered_table=0x48ab569f2370, this=<optimized out>) at /data/Server/10.6_with_patch/sql/handler.h:4683
      #23 mysql_inplace_alter_table (thd=thd@entry=0x62b00011f218, table_list=<optimized out>, table=table@entry=0x61900159aa98, altered_table=<optimized out>, ha_alter_info=0x48ab569f1c70, target_mdl_request=<optimized out>, 
          ddl_log_state=<optimized out>, trigger_param=<optimized out>, alter_ctx=<optimized out>) at /data/Server/10.6_with_patch/sql/sql_table.cc:7423
      #24 0x000055d6938cd25c in mysql_alter_table (thd=thd@entry=0x62b00011f218, new_db=new_db@entry=0x62b000123c48, new_name=new_name@entry=0x62b0001240a0, create_info=0x48ab569f4720, table_list=<optimized out>, 
          table_list@entry=0x62b000181390, recreate_info=<optimized out>, alter_info=<optimized out>, order_num=<optimized out>, order=<optimized out>, ignore=<optimized out>, if_exists=<optimized out>)
          at /data/Server/10.6_with_patch/sql/sql_table.cc:10354
      #25 0x000055d693a58daa in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x62b00011f218) at /data/Server/10.6_with_patch/sql/structs.h:568
      #26 0x000055d6936139d9 in mysql_execute_command (thd=thd@entry=0x62b00011f218, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /data/Server/10.6_with_patch/sql/sql_parse.cc:6014
      #27 0x000055d6935c9ae4 in mysql_parse (thd=thd@entry=0x62b00011f218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x48ab569f6130) at /data/Server/10.6_with_patch/sql/sql_parse.cc:8036
      #28 0x000055d6935f8f2a in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b00011f218, packet=packet@entry=0x6290005be219 "ALTER TABLE `t7` FORCE /* E_R Thread2 QNO 9 CON_ID 16 */ ", 
          packet_length=packet_length@entry=57, blocking=blocking@entry=true) at /data/Server/10.6_with_patch/sql/sql_class.h:1385
      #29 0x000055d6935ffd50 in do_command (thd=0x62b00011f218, blocking=blocking@entry=true) at /data/Server/10.6_with_patch/sql/sql_parse.cc:1409
      #30 0x000055d693a41fc3 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x608000002eb8, put_in_cache=put_in_cache@entry=true) at /data/Server/10.6_with_patch/sql/sql_connect.cc:1416
      #31 0x000055d693a42b0c in handle_one_connection (arg=0x608000002eb8) at /data/Server/10.6_with_patch/sql/sql_connect.cc:1318
      #32 0x000039254d707609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #33 0x000055d69745d293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      (rr)
       
      pluto:/data/results/1685024739/TBR-1868$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio
       
      RQG
      ===
      # git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
      #
      # GIT_SHOW: HEAD -> master, origin/master, origin/HEAD 9b710460ff8c54260b0e0649f029efa932aee22c 2023-05-23T14:33:37+02:00
      # rqg.pl  : Version 4.2.1 (2022-12)
      #
      # $RQG_HOME/rqg.pl \
      # --duration=300 \
      # --queries=10000000 \
      # --no_mask \
      # --seed=random \
      # --rpl_mode=none \
      # --engine=InnoDB \
      # --gendata=conf/mariadb/innodb_compression_encryption.zz \
      # --max_gd_duration=1200 \
      # --mysqld=--interactive_timeout=28800 \
      # --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      # --mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
      # --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      # --mysqld=--net_read_timeout=30 \
      # --mysqld=--innodb-open-files=10 \   # <== Important
      # --mysqld=--net_write_timeout=60 \
      # --mysqld=--innodb-encryption-threads=1 \
      # --mysqld=--wait_timeout=28800 \
      # --mysqld=--plugin-load-add=file_key_management.so \
      # --mysqld=--slave_net_timeout=60 \
      # --mysqld=--loose-max-statement-time=30 \
      # --mysqld=--loose-idle_transaction_timeout=0 \
      # --mysqld=--log_bin_trust_function_creators=1 \
      # --mysqld=--innodb_page_size=4K \
      # --mysqld=--connect_timeout=60 \
      # --mysqld=--loose-idle_write_transaction_timeout=0 \
      # --mysqld=--log-output=none \
      # --mysqld=--loose-table_lock_wait_timeout=50 \
      # --mysqld=--file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      # --mysqld=--lock-wait-timeout=86400 \
      # --mysqld=--innodb-lock-wait-timeout=50 \
      # --mysqld=--sync-binlog=1 \
      # --mysqld=--loose-innodb_read_only_compressed=OFF \
      # --mysqld=--innodb_undo_tablespaces=16 \
      # --mysqld=--loose-plugin-load-add=provider_lz4.so \
      # --mysqld=--innodb-buffer-pool-size=5M \
      # --mysqld=--log-bin \
      # --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      # --reporters=Backtrace,Deadlock,None \
      # --validators=None \
      # --grammar=TBR-1868.yy \
      # <local settings>
      

      Attachments

        1. TBR-1868.cfg
          44 kB
          Matthias Leich
        2. TBR-1868.yy
          0.2 kB
          Matthias Leich

        Issue Links

          Activity

            Other RQG test runs based on TBR-1868.cfg and TBR-1868.yy showed in addition
             
            pluto:/data/results/1685024739/TBR-1877$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio
            [rr 1297319 1139214]2023-05-25 15:17:08 16 [Note] InnoDB: Cannot close file ./test/#sql-alter-13cba7-10.ibd because of 1 pending operations
            [rr 1297319 1139281]2023-05-25 15:17:08 16 [ERROR] InnoDB: preallocating 20480 bytes for file ./test/#sql-alter-13cba7-10.ibd failed with error 9
            [rr 1297319 1139304]2023-05-25 15:17:08 15 [Warning] InnoDB: innodb_open_files=10 is exceeded (18 files stay open)
            [rr 1297319 1139334]2023-05-25 15:17:08 0x232d56ad2700[rr 1297319 1139337]  InnoDB: Assertion failure in file /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc line 2967
            [rr 1297319 1139339]InnoDB: Failing assertion: fil_system.n_open == n_open
             
            (rr) bt
            #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
            #1  0x000041c33a6cb859 in __GI_abort () at abort.c:79
            #2  0x000055e5374b2853 in ut_dbg_assertion_failed (expr=expr@entry=0x55e5383fa240 "fil_system.n_open == n_open", file=file@entry=0x55e5383fd780 "/data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc", 
                line=line@entry=2967) at /data/Server/10.6_with_patch/storage/innobase/ut/ut0dbg.cc:60
            #3  0x000055e537721b6e in fil_validate () at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:2967
            #4  0x000055e537721c79 in fil_validate_skip () at /usr/include/c++/9/bits/atomic_base.h:539
            #5  0x000055e537722f42 in fil_space_t::io (this=this@entry=0x6120000a7a40, type=..., offset=<optimized out>, len=len@entry=4096, buf=<optimized out>, bpage=bpage@entry=0x6400000655b0)
                at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:2817
            #6  0x000055e537617b1b in buf_read_page_low (space=<optimized out>, sync=sync@entry=true, mode=mode@entry=132, page_id=..., zip_size=zip_size@entry=0, unzip=unzip@entry=false)
                at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:215
            #7  0x000055e53761e09a in buf_read_page (page_id=..., zip_size=zip_size@entry=0) at /data/Server/10.6_with_patch/storage/innobase/buf/buf0rea.cc:452
            #8  0x000055e5375b86c4 in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x232d56ac9f80, err=<optimized out>, 
                allow_ibuf_merge=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/buf/buf0buf.cc:2518
            #9  0x000055e5375bbf4e in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x232d56ac9f80, err=0x232d56ac9ba0, 
                allow_ibuf_merge=false) at /data/Server/10.6_with_patch/storage/innobase/buf/buf0buf.cc:2927
            #10 0x000055e5372b7de1 in row_merge_read_clustered_index (trx=trx@entry=0x154f05dc1940, table=table@entry=0x232d56acc370, old_table=old_table@entry=0x618000070508, new_table=new_table@entry=0x6180003af908, 
                online=online@entry=true, index=index@entry=0x619001447c78, fts_sort_idx=<optimized out>, psort_info=<optimized out>, files=<optimized out>, key_numbers=<optimized out>, n_index=<optimized out>, 
                defaults=<optimized out>, add_v=<optimized out>, col_map=<optimized out>, add_autoinc=<optimized out>, sequence=..., block=<optimized out>, skip_pk_sort=<optimized out>, tmpfd=<optimized out>, stage=<optimized out>, 
                pct_cost=pct_cost@entry=33.333333333333336, crypt_block=<optimized out>, eval_table=<optimized out>, allow_not_null=<optimized out>, col_collate=<optimized out>)
                at /data/Server/10.6_with_patch/storage/innobase/include/buf0types.h:95
            #11 0x000055e5372bfbe9 in row_merge_build_indexes (trx=<optimized out>, old_table=<optimized out>, new_table=new_table@entry=0x6180003af908, online=<optimized out>, indexes=0x619001447c78, key_numbers=<optimized out>, 
                n_indexes=<optimized out>, table=<optimized out>, defaults=<optimized out>, col_map=<optimized out>, add_autoinc=<optimized out>, sequence=..., skip_pk_sort=<optimized out>, stage=<optimized out>, 
                add_v=<optimized out>, eval_table=<optimized out>, allow_not_null=<optimized out>, col_collate=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/row/row0merge.cc:4584
            #12 0x000055e536f87e08 in ha_innobase::inplace_alter_table (this=0x61d00012cab8, altered_table=<optimized out>, ha_alter_info=0x232d56acbc70) at /usr/include/c++/9/bits/stl_tree.h:1038
            #13 0x000055e535effca4 in handler::ha_inplace_alter_table (ha_alter_info=0x232d56acbc70, altered_table=0x232d56acc370, this=<optimized out>) at /data/Server/10.6_with_patch/sql/handler.h:4683
            #14 mysql_inplace_alter_table (thd=thd@entry=0x62b0000cb218, table_list=<optimized out>, table=table@entry=0x619002535098, altered_table=<optimized out>, ha_alter_info=0x232d56acbc70, target_mdl_request=<optimized out>, 
                ddl_log_state=<optimized out>, trigger_param=<optimized out>, alter_ctx=<optimized out>) at /data/Server/10.6_with_patch/sql/sql_table.cc:7423
            #15 0x000055e535f3725c in mysql_alter_table (thd=thd@entry=0x62b0000cb218, new_db=new_db@entry=0x62b0000cfc48, new_name=new_name@entry=0x62b0000d00a0, create_info=0x232d56ace720, table_list=<optimized out>, 
                table_list@entry=0x62b0000d2390, recreate_info=<optimized out>, alter_info=<optimized out>, order_num=<optimized out>, order=<optimized out>, ignore=<optimized out>, if_exists=<optimized out>)
                at /data/Server/10.6_with_patch/sql/sql_table.cc:10354
            #16 0x000055e5360c2daa in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x62b0000cb218) at /data/Server/10.6_with_patch/sql/structs.h:568
            #17 0x000055e535c7d9d9 in mysql_execute_command (thd=thd@entry=0x62b0000cb218, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /data/Server/10.6_with_patch/sql/sql_parse.cc:6014
            #18 0x000055e535c33ae4 in mysql_parse (thd=thd@entry=0x62b0000cb218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x232d56ad0130) at /data/Server/10.6_with_patch/sql/sql_parse.cc:8036
            #19 0x000055e535c62f2a in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0000cb218, packet=packet@entry=0x6290002cb219 "ALTER TABLE `t8` FORCE /* E_R Thread1 QNO 21 CON_ID 15 */ ", 
                packet_length=packet_length@entry=58, blocking=blocking@entry=true) at /data/Server/10.6_with_patch/sql/sql_class.h:1385
            #20 0x000055e535c69d50 in do_command (thd=0x62b0000cb218, blocking=blocking@entry=true) at /data/Server/10.6_with_patch/sql/sql_parse.cc:1409
            #21 0x000055e5360abfc3 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x608000002e38, put_in_cache=put_in_cache@entry=true) at /data/Server/10.6_with_patch/sql/sql_connect.cc:1416
            #22 0x000055e5360acb0c in handle_one_connection (arg=0x608000002e38) at /data/Server/10.6_with_patch/sql/sql_connect.cc:1318
            #23 0x000049145c4d7609 in start_thread (arg=<optimized out>) at pthread_create.c:477
            #24 0x000041c33a7c8293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
            (rr) 
             
            [rr 1169223 959355]2023-05-25 15:01:27 0x640000361700[rr 1169223 959358]  InnoDB: Assertion failure in file /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc line 607
            [rr 1169223 959360]InnoDB: Failing assertion: fsize != os_offset_t(-1)
            and
            [rr 1169223 959396]2023-05-25 15:01:27 0x1df07a431700[rr 1169223 959399]  InnoDB: Assertion failure in file /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc line 2967
            [rr 1169223 959401]InnoDB: Failing assertion: fil_system.n_open == n_open
             
            pluto:/data/results/1685024739/TBR-1877--TBR-1870$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio
             
            It might be that the current bug is a sibling of MDEV-31156 even though my server setups differs.
            

            mleich Matthias Leich added a comment - Other RQG test runs based on TBR-1868.cfg and TBR-1868.yy showed in addition   pluto:/data/results/1685024739/TBR-1877$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio [rr 1297319 1139214]2023-05-25 15:17:08 16 [Note] InnoDB: Cannot close file ./test/#sql-alter-13cba7-10.ibd because of 1 pending operations [rr 1297319 1139281]2023-05-25 15:17:08 16 [ERROR] InnoDB: preallocating 20480 bytes for file ./test/#sql-alter-13cba7-10.ibd failed with error 9 [rr 1297319 1139304]2023-05-25 15:17:08 15 [Warning] InnoDB: innodb_open_files=10 is exceeded (18 files stay open) [rr 1297319 1139334]2023-05-25 15:17:08 0x232d56ad2700[rr 1297319 1139337] InnoDB: Assertion failure in file /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc line 2967 [rr 1297319 1139339]InnoDB: Failing assertion: fil_system.n_open == n_open   (rr) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x000041c33a6cb859 in __GI_abort () at abort.c:79 #2 0x000055e5374b2853 in ut_dbg_assertion_failed (expr=expr@entry=0x55e5383fa240 "fil_system.n_open == n_open", file=file@entry=0x55e5383fd780 "/data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc", line=line@entry=2967) at /data/Server/10.6_with_patch/storage/innobase/ut/ut0dbg.cc:60 #3 0x000055e537721b6e in fil_validate () at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:2967 #4 0x000055e537721c79 in fil_validate_skip () at /usr/include/c++/9/bits/atomic_base.h:539 #5 0x000055e537722f42 in fil_space_t::io (this=this@entry=0x6120000a7a40, type=..., offset=<optimized out>, len=len@entry=4096, buf=<optimized out>, bpage=bpage@entry=0x6400000655b0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:2817 #6 0x000055e537617b1b in buf_read_page_low (space=<optimized out>, sync=sync@entry=true, mode=mode@entry=132, page_id=..., zip_size=zip_size@entry=0, unzip=unzip@entry=false) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:215 #7 0x000055e53761e09a in buf_read_page (page_id=..., zip_size=zip_size@entry=0) at /data/Server/10.6_with_patch/storage/innobase/buf/buf0rea.cc:452 #8 0x000055e5375b86c4 in buf_page_get_low (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x232d56ac9f80, err=<optimized out>, allow_ibuf_merge=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/buf/buf0buf.cc:2518 #9 0x000055e5375bbf4e in buf_page_get_gen (page_id=..., zip_size=zip_size@entry=0, rw_latch=rw_latch@entry=1, guess=guess@entry=0x0, mode=mode@entry=10, mtr=mtr@entry=0x232d56ac9f80, err=0x232d56ac9ba0, allow_ibuf_merge=false) at /data/Server/10.6_with_patch/storage/innobase/buf/buf0buf.cc:2927 #10 0x000055e5372b7de1 in row_merge_read_clustered_index (trx=trx@entry=0x154f05dc1940, table=table@entry=0x232d56acc370, old_table=old_table@entry=0x618000070508, new_table=new_table@entry=0x6180003af908, online=online@entry=true, index=index@entry=0x619001447c78, fts_sort_idx=<optimized out>, psort_info=<optimized out>, files=<optimized out>, key_numbers=<optimized out>, n_index=<optimized out>, defaults=<optimized out>, add_v=<optimized out>, col_map=<optimized out>, add_autoinc=<optimized out>, sequence=..., block=<optimized out>, skip_pk_sort=<optimized out>, tmpfd=<optimized out>, stage=<optimized out>, pct_cost=pct_cost@entry=33.333333333333336, crypt_block=<optimized out>, eval_table=<optimized out>, allow_not_null=<optimized out>, col_collate=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/include/buf0types.h:95 #11 0x000055e5372bfbe9 in row_merge_build_indexes (trx=<optimized out>, old_table=<optimized out>, new_table=new_table@entry=0x6180003af908, online=<optimized out>, indexes=0x619001447c78, key_numbers=<optimized out>, n_indexes=<optimized out>, table=<optimized out>, defaults=<optimized out>, col_map=<optimized out>, add_autoinc=<optimized out>, sequence=..., skip_pk_sort=<optimized out>, stage=<optimized out>, add_v=<optimized out>, eval_table=<optimized out>, allow_not_null=<optimized out>, col_collate=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/row/row0merge.cc:4584 #12 0x000055e536f87e08 in ha_innobase::inplace_alter_table (this=0x61d00012cab8, altered_table=<optimized out>, ha_alter_info=0x232d56acbc70) at /usr/include/c++/9/bits/stl_tree.h:1038 #13 0x000055e535effca4 in handler::ha_inplace_alter_table (ha_alter_info=0x232d56acbc70, altered_table=0x232d56acc370, this=<optimized out>) at /data/Server/10.6_with_patch/sql/handler.h:4683 #14 mysql_inplace_alter_table (thd=thd@entry=0x62b0000cb218, table_list=<optimized out>, table=table@entry=0x619002535098, altered_table=<optimized out>, ha_alter_info=0x232d56acbc70, target_mdl_request=<optimized out>, ddl_log_state=<optimized out>, trigger_param=<optimized out>, alter_ctx=<optimized out>) at /data/Server/10.6_with_patch/sql/sql_table.cc:7423 #15 0x000055e535f3725c in mysql_alter_table (thd=thd@entry=0x62b0000cb218, new_db=new_db@entry=0x62b0000cfc48, new_name=new_name@entry=0x62b0000d00a0, create_info=0x232d56ace720, table_list=<optimized out>, table_list@entry=0x62b0000d2390, recreate_info=<optimized out>, alter_info=<optimized out>, order_num=<optimized out>, order=<optimized out>, ignore=<optimized out>, if_exists=<optimized out>) at /data/Server/10.6_with_patch/sql/sql_table.cc:10354 #16 0x000055e5360c2daa in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x62b0000cb218) at /data/Server/10.6_with_patch/sql/structs.h:568 #17 0x000055e535c7d9d9 in mysql_execute_command (thd=thd@entry=0x62b0000cb218, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /data/Server/10.6_with_patch/sql/sql_parse.cc:6014 #18 0x000055e535c33ae4 in mysql_parse (thd=thd@entry=0x62b0000cb218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x232d56ad0130) at /data/Server/10.6_with_patch/sql/sql_parse.cc:8036 #19 0x000055e535c62f2a in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0000cb218, packet=packet@entry=0x6290002cb219 "ALTER TABLE `t8` FORCE /* E_R Thread1 QNO 21 CON_ID 15 */ ", packet_length=packet_length@entry=58, blocking=blocking@entry=true) at /data/Server/10.6_with_patch/sql/sql_class.h:1385 #20 0x000055e535c69d50 in do_command (thd=0x62b0000cb218, blocking=blocking@entry=true) at /data/Server/10.6_with_patch/sql/sql_parse.cc:1409 #21 0x000055e5360abfc3 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x608000002e38, put_in_cache=put_in_cache@entry=true) at /data/Server/10.6_with_patch/sql/sql_connect.cc:1416 #22 0x000055e5360acb0c in handle_one_connection (arg=0x608000002e38) at /data/Server/10.6_with_patch/sql/sql_connect.cc:1318 #23 0x000049145c4d7609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #24 0x000041c33a7c8293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 (rr)   [rr 1169223 959355]2023-05-25 15:01:27 0x640000361700[rr 1169223 959358] InnoDB: Assertion failure in file /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc line 607 [rr 1169223 959360]InnoDB: Failing assertion: fsize != os_offset_t(-1) and [rr 1169223 959396]2023-05-25 15:01:27 0x1df07a431700[rr 1169223 959399] InnoDB: Assertion failure in file /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc line 2967 [rr 1169223 959401]InnoDB: Failing assertion: fil_system.n_open == n_open   pluto:/data/results/1685024739/TBR-1877--TBR-1870$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio   It might be that the current bug is a sibling of MDEV-31156 even though my server setups differs.

            In the first trace, there seem to be 2 InnoDB fil_space_t objects that are pointing to the same file handle 93. One thread is in the process of creating and opening a file:

            10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf

            (rr) when
            Current event: 1161910
            (rr) thread apply 2 backtrace
            Thread 2 (Thread 1231496.1232721):
            …
            #12 0x000055d6950c12d9 in inline_mysql_mutex_unlock (src_line=2004, src_file=0x55d695d93780 "/data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc", that=0x55d6972f5cc8 <fil_system+40>) at /data/Server/10.6_with_patch/include/mysql/psi/mysql_thread.h:790
            #13 fil_ibd_create (space_id=<optimized out>, name=..., path=<optimized out>, path@entry=0x6030000142f0 "./test/#sql-alter-12ca88-e.ibd", flags=<optimized out>, size=size@entry=4, mode=mode@entry=FIL_ENCRYPTION_DEFAULT, key_id=<optimized out>, err=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:2004
            …
            (rr) thread apply 45 backtrace
            Thread 45 (Thread 1231496.1360811):
            #0  0x000055d6950a288a in fil_node_t::close (this=this@entry=0x6060001c9d60) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:99
            #1  0x000055d6950aac8a in fil_space_t::try_to_close (print_info=print_info@entry=true) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:129
            #2  0x000055d6950b1484 in fil_node_open_file (node=node@entry=0x6060000fea80) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:427
            #3  0x000055d6950b1acd in fil_space_t::prepare_acquired (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101
            #4  0x000055d6950b1e68 in fil_space_t::acquire_and_prepare (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:711
            #5  0x000055d6950b2242 in fil_space_t::acquire (this=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:722
            #6  fil_space_extend (space=space@entry=0x6120000448c0, size=size@entry=7) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:724
            #7  0x000055d6950edda2 in fsp_try_extend_data_file_with_pages (space=space@entry=0x6120000448c0, page_no=page_no@entry=6, header=header@entry=0x13de2229fdc0, mtr=mtr@entry=0x48ab569eed90) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:606
            #8  0x000055d6950ee372 in fsp_reserve_free_pages (space=space@entry=0x6120000448c0, header=header@entry=0x13de2229fdc0, size=size@entry=5, mtr=mtr@entry=0x48ab569eed90, n_pages=n_pages@entry=2) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:2337
            

            Curiously, the two different files are being assigned an identical temporary name during the execution of the two concurrently run DDL statements:

            ALTER TABLE `t7` FORCE /* E_R Thread2 QNO 9 CON_ID 16 */ ;
            ALTER TABLE `t1` FORCE /* E_R Thread1 QNO 14 CON_ID 14 */ ;
            

            That is why the two fil_node_t::handle for the two distinct tablespaces and tables will be identical. This should lead to all thinkable sorts of corruption, including the long-time mystery MDEV-12434.

            marko Marko Mäkelä added a comment - In the first trace, there seem to be 2 InnoDB fil_space_t objects that are pointing to the same file handle 93. One thread is in the process of creating and opening a file: 10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf (rr) when Current event: 1161910 (rr) thread apply 2 backtrace Thread 2 (Thread 1231496.1232721): … #12 0x000055d6950c12d9 in inline_mysql_mutex_unlock (src_line=2004, src_file=0x55d695d93780 "/data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc", that=0x55d6972f5cc8 <fil_system+40>) at /data/Server/10.6_with_patch/include/mysql/psi/mysql_thread.h:790 #13 fil_ibd_create (space_id=<optimized out>, name=..., path=<optimized out>, path@entry=0x6030000142f0 "./test/#sql-alter-12ca88-e.ibd", flags=<optimized out>, size=size@entry=4, mode=mode@entry=FIL_ENCRYPTION_DEFAULT, key_id=<optimized out>, err=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:2004 … (rr) thread apply 45 backtrace Thread 45 (Thread 1231496.1360811): #0 0x000055d6950a288a in fil_node_t::close (this=this@entry=0x6060001c9d60) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:99 #1 0x000055d6950aac8a in fil_space_t::try_to_close (print_info=print_info@entry=true) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:129 #2 0x000055d6950b1484 in fil_node_open_file (node=node@entry=0x6060000fea80) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:427 #3 0x000055d6950b1acd in fil_space_t::prepare_acquired (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101 #4 0x000055d6950b1e68 in fil_space_t::acquire_and_prepare (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:711 #5 0x000055d6950b2242 in fil_space_t::acquire (this=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:722 #6 fil_space_extend (space=space@entry=0x6120000448c0, size=size@entry=7) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:724 #7 0x000055d6950edda2 in fsp_try_extend_data_file_with_pages (space=space@entry=0x6120000448c0, page_no=page_no@entry=6, header=header@entry=0x13de2229fdc0, mtr=mtr@entry=0x48ab569eed90) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:606 #8 0x000055d6950ee372 in fsp_reserve_free_pages (space=space@entry=0x6120000448c0, header=header@entry=0x13de2229fdc0, size=size@entry=5, mtr=mtr@entry=0x48ab569eed90, n_pages=n_pages@entry=2) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:2337 Curiously, the two different files are being assigned an identical temporary name during the execution of the two concurrently run DDL statements: ALTER TABLE `t7` FORCE /* E_R Thread2 QNO 9 CON_ID 16 */ ; ALTER TABLE `t1` FORCE /* E_R Thread1 QNO 14 CON_ID 14 */ ; That is why the two fil_node_t::handle for the two distinct tablespaces and tables will be identical. This should lead to all thinkable sorts of corruption, including the long-time mystery MDEV-12434 .

            Thread 45, which will report the error, originally created the name #sql-alter-12ca88-10, based on its THD::thread_id value of 16, for the DDL statement on table t7:

            10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf

            Thread 45 hit Breakpoint 13, Alter_table_ctx::Alter_table_ctx (this=0x48ab569f3280, thd=0x62b00011f218, table_list=0x62b000181390, tables_opened_arg=<optimized out>, new_db_arg=<optimized out>, new_name_arg=<optimized out>) at /data/Server/10.6_with_patch/sql/sql_alter.cc:324
            324	  tmp_name.length= my_snprintf(tmp_name_buff, sizeof(tmp_name_buff),
            

            After this, Thread 2 will create the similar temporary name that occurs in the fil_ibd_create() call mentioned in my previous comment, carrying the THD::thread_id value of 14. During the time Thread 2 is creating the name, Thread 45 will be waiting here:

            10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf

            #8  0x000055d6974533f1 in fdatasync (fd=fd@entry=31) at ../sysdeps/unix/sysv/linux/fdatasync.c:28
            #9  0x000055d6953336a6 in my_sync (fd=31, my_flags=my_flags@entry=16) at /data/Server/10.6_with_patch/mysys/my_sync.c:92
            #10 0x000055d693911052 in inline_mysql_file_sync (flags=16, fd=<optimized out>) at /data/Server/10.6_with_patch/include/mysql/psi/mysql_file.h:1448
            #11 ddl_log_sync_file () at /data/Server/10.6_with_patch/sql/ddl_log.cc:193
            #12 0x000055d693911090 in ddl_log_sync_no_lock () at /data/Server/10.6_with_patch/sql/ddl_log.cc:203
            #13 0x000055d69391c512 in ddl_log_write_execute_entry (first_entry=<optimized out>, active_entry=0x48ab569f1978) at /data/Server/10.6_with_patch/sql/ddl_log.cc:2557
            #14 0x000055d6939215dd in ddl_log_store_query (thd=thd@entry=0x62b00011f218, ddl_state=ddl_state@entry=0x48ab569f1970, query=0x62b000181270 "", length=0) at /data/Server/10.6_with_patch/sql/ddl_log.cc:3513
            #15 0x000055d6938cb97e in mysql_alter_table (thd=thd@entry=0x62b00011f218, new_db=new_db@entry=0x62b000123c48, new_name=new_name@entry=0x62b0001240a0, create_info=0x48ab569f4720, table_list=<optimized out>, table_list@entry=0x62b000181390, recreate_info=<optimized out>, alter_info=<optimized out>, order_num=<optimized out>, order=<optimized out>, ignore=<optimized out>, if_exists=<optimized out>) at /data/Server/10.6_with_patch/sql/sql_class.h:1385
            

            A little after this, Thread 45 will invoke ha_innobase::prepare_inplace_alter_table(), still with the correct name of the intermediate table. It will close the file test/t4.ibd, which was file descriptor 93, and open test/t7.ibd as file descriptor 93. It creates the intermediate file #sql-alter-12ca88-10.ibd as file descriptor 95, and closes test/t7.ibd, which no longer is known as 93.

            After all this, Thread 2 finally enters ha_innobase::prepare_inplace_alter_table() and opens test/t1.ibd as file descriptor 56. Next, Thread 45 will open test/t7.ibd as file descriptor 56:

            10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf

            #9  0x000039254d712ada in __libc_open64 (file=0x6020001a44b0 "./test/t7.ibd", oflag=oflag@entry=524290) at ../sysdeps/unix/sysv/linux/open64.c:48
            #10 0x000055d694b0153d in os_file_create_func (name=<optimized out>, create_mode=<optimized out>, purpose=purpose@entry=61, type=type@entry=100, read_only=<optimized out>, success=success@entry=0x48ab569eed60) at /data/Server/10.6_with_patch/storage/innobase/os/os0file.cc:1204
            #11 0x000055d6950afeab in fil_node_open_file_low (node=node@entry=0x6060000bb160) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:375
            #12 0x000055d6950b1691 in fil_node_open_file (node=node@entry=0x6060000bb160) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101
            #13 0x000055d6950b1acd in fil_space_t::prepare_acquired (this=this@entry=0x6120000c9340) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101
            #14 0x000055d6950b2550 in fil_space_t::get (id=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:1496
            #15 0x000055d694fb4045 in buf_read_page (page_id=..., zip_size=zip_size@entry=0) at /data/Server/10.6_with_patch/storage/innobase/include/buf0types.h:131
            

            Concurrently, Thread 2 is creating the intermediate file #sql-alter-12ca88-e.ibd as file descriptor 93:

            10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf

            #0  os_file_create_func (name=0x6030000142f0 "./test/#sql-alter-12ca88-e.ibd", create_mode=create_mode@entry=180, purpose=purpose@entry=61, type=100, read_only=false, success=success@entry=0x7f1c0e285840) at /data/Server/10.6_with_patch/storage/innobase/os/os0file.cc:1126
            #1  0x000055d6950c0748 in fil_ibd_create (space_id=<optimized out>, name=..., path=<optimized out>, path@entry=0x6030000142f0 "./test/#sql-alter-12ca88-e.ibd", flags=<optimized out>, size=size@entry=4, mode=mode@entry=FIL_ENCRYPTION_DEFAULT, key_id=<optimized out>, err=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:1921
            #2  0x000055d694fce561 in dict_create_index_space (node=...) at /data/Server/10.6_with_patch/storage/innobase/include/dict0dict.inl:670
            

            When Thread 2 finishes creating the file, Thread 45 is busy trying to extend its intermediate file, on which it holds no file descriptor:

            10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf

            #2  0x000055d6950b13b0 in std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > (__rtime=...) at /usr/include/c++/9/thread:378
            #3  fil_node_open_file (node=node@entry=0x6060000fea80) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:440
            #4  0x000055d6950b1acd in fil_space_t::prepare_acquired (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101
            #5  0x000055d6950b1e68 in fil_space_t::acquire_and_prepare (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:711
            #6  0x000055d6950b2242 in fil_space_t::acquire (this=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:722
            #7  fil_space_extend (space=space@entry=0x6120000448c0, size=size@entry=7) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:724
            #8  0x000055d6950edda2 in fsp_try_extend_data_file_with_pages (space=space@entry=0x6120000448c0, page_no=page_no@entry=6, header=header@entry=0x13de2229fdc0, mtr=mtr@entry=0x48ab569eed90) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:606
            #9  0x000055d6950ee372 in fsp_reserve_free_pages (space=space@entry=0x6120000448c0, header=header@entry=0x13de2229fdc0, size=size@entry=5, mtr=mtr@entry=0x48ab569eed90, n_pages=n_pages@entry=2) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:2337
            #10 0x000055d6950ef0fd in fsp_reserve_free_extents (n_reserved=n_reserved@entry=0x48ab569eed70, space=0x6120000448c0, n_ext=n_ext@entry=1, alloc_type=alloc_type@entry=FSP_NORMAL, mtr=mtr@entry=0x48ab569eed90, n_pages=n_pages@entry=2) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:2414
            

            That is when Thread 45 chooses to close the file handle that Thread 2 is creating:

            10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf

            1999		mysql_mutex_lock(&fil_system.mutex);
            2000		if (fil_space_t* space = fil_space_t::create(space_id, flags,
            2001							     FIL_TYPE_TABLESPACE,
            2002							     crypt_data, mode, true)) {
            2003			fil_node_t* node = space->add(path, file, size, false, true);
            2004			mysql_mutex_unlock(&fil_system.mutex);
            2005			IF_WIN(node->find_metadata(), node->find_metadata(file, true));
            

            Clearly, Thread 2 is in the wrong here, assuming that file, which was attached to node->handle, is still open. As soon as fil_system.mutex is released, other threads may close the file.

            This would wrongly allow Thread 2 to share the file handle that Thread 45 created. If we assume that both ALTER TABLE statements complete successfully, ultimately both tables t1 and t7 will end up being corrupted. One would receive some but not all writes that were intended for t7, and the other would receive writes intended for both t1 and t7.

            As far as I can tell, this error was introduced in MDEV-17380 (MariaDB Server 10.4.4).

            marko Marko Mäkelä added a comment - Thread 45, which will report the error, originally created the name #sql-alter-12ca88-10 , based on its THD::thread_id value of 16, for the DDL statement on table t7 : 10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf Thread 45 hit Breakpoint 13, Alter_table_ctx::Alter_table_ctx (this=0x48ab569f3280, thd=0x62b00011f218, table_list=0x62b000181390, tables_opened_arg=<optimized out>, new_db_arg=<optimized out>, new_name_arg=<optimized out>) at /data/Server/10.6_with_patch/sql/sql_alter.cc:324 324 tmp_name.length= my_snprintf(tmp_name_buff, sizeof(tmp_name_buff), After this, Thread 2 will create the similar temporary name that occurs in the fil_ibd_create() call mentioned in my previous comment, carrying the THD::thread_id value of 14. During the time Thread 2 is creating the name, Thread 45 will be waiting here: 10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf #8 0x000055d6974533f1 in fdatasync (fd=fd@entry=31) at ../sysdeps/unix/sysv/linux/fdatasync.c:28 #9 0x000055d6953336a6 in my_sync (fd=31, my_flags=my_flags@entry=16) at /data/Server/10.6_with_patch/mysys/my_sync.c:92 #10 0x000055d693911052 in inline_mysql_file_sync (flags=16, fd=<optimized out>) at /data/Server/10.6_with_patch/include/mysql/psi/mysql_file.h:1448 #11 ddl_log_sync_file () at /data/Server/10.6_with_patch/sql/ddl_log.cc:193 #12 0x000055d693911090 in ddl_log_sync_no_lock () at /data/Server/10.6_with_patch/sql/ddl_log.cc:203 #13 0x000055d69391c512 in ddl_log_write_execute_entry (first_entry=<optimized out>, active_entry=0x48ab569f1978) at /data/Server/10.6_with_patch/sql/ddl_log.cc:2557 #14 0x000055d6939215dd in ddl_log_store_query (thd=thd@entry=0x62b00011f218, ddl_state=ddl_state@entry=0x48ab569f1970, query=0x62b000181270 "", length=0) at /data/Server/10.6_with_patch/sql/ddl_log.cc:3513 #15 0x000055d6938cb97e in mysql_alter_table (thd=thd@entry=0x62b00011f218, new_db=new_db@entry=0x62b000123c48, new_name=new_name@entry=0x62b0001240a0, create_info=0x48ab569f4720, table_list=<optimized out>, table_list@entry=0x62b000181390, recreate_info=<optimized out>, alter_info=<optimized out>, order_num=<optimized out>, order=<optimized out>, ignore=<optimized out>, if_exists=<optimized out>) at /data/Server/10.6_with_patch/sql/sql_class.h:1385 A little after this, Thread 45 will invoke ha_innobase::prepare_inplace_alter_table() , still with the correct name of the intermediate table. It will close the file test/t4.ibd , which was file descriptor 93, and open test/t7.ibd as file descriptor 93. It creates the intermediate file #sql-alter-12ca88-10.ibd as file descriptor 95, and closes test/t7.ibd , which no longer is known as 93. After all this, Thread 2 finally enters ha_innobase::prepare_inplace_alter_table() and opens test/t1.ibd as file descriptor 56. Next, Thread 45 will open test/t7.ibd as file descriptor 56: 10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf #9 0x000039254d712ada in __libc_open64 (file=0x6020001a44b0 "./test/t7.ibd", oflag=oflag@entry=524290) at ../sysdeps/unix/sysv/linux/open64.c:48 #10 0x000055d694b0153d in os_file_create_func (name=<optimized out>, create_mode=<optimized out>, purpose=purpose@entry=61, type=type@entry=100, read_only=<optimized out>, success=success@entry=0x48ab569eed60) at /data/Server/10.6_with_patch/storage/innobase/os/os0file.cc:1204 #11 0x000055d6950afeab in fil_node_open_file_low (node=node@entry=0x6060000bb160) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:375 #12 0x000055d6950b1691 in fil_node_open_file (node=node@entry=0x6060000bb160) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101 #13 0x000055d6950b1acd in fil_space_t::prepare_acquired (this=this@entry=0x6120000c9340) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101 #14 0x000055d6950b2550 in fil_space_t::get (id=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:1496 #15 0x000055d694fb4045 in buf_read_page (page_id=..., zip_size=zip_size@entry=0) at /data/Server/10.6_with_patch/storage/innobase/include/buf0types.h:131 Concurrently, Thread 2 is creating the intermediate file #sql-alter-12ca88-e.ibd as file descriptor 93: 10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf #0 os_file_create_func (name=0x6030000142f0 "./test/#sql-alter-12ca88-e.ibd", create_mode=create_mode@entry=180, purpose=purpose@entry=61, type=100, read_only=false, success=success@entry=0x7f1c0e285840) at /data/Server/10.6_with_patch/storage/innobase/os/os0file.cc:1126 #1 0x000055d6950c0748 in fil_ibd_create (space_id=<optimized out>, name=..., path=<optimized out>, path@entry=0x6030000142f0 "./test/#sql-alter-12ca88-e.ibd", flags=<optimized out>, size=size@entry=4, mode=mode@entry=FIL_ENCRYPTION_DEFAULT, key_id=<optimized out>, err=<optimized out>) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:1921 #2 0x000055d694fce561 in dict_create_index_space (node=...) at /data/Server/10.6_with_patch/storage/innobase/include/dict0dict.inl:670 When Thread 2 finishes creating the file, Thread 45 is busy trying to extend its intermediate file, on which it holds no file descriptor: 10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf #2 0x000055d6950b13b0 in std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > (__rtime=...) at /usr/include/c++/9/thread:378 #3 fil_node_open_file (node=node@entry=0x6060000fea80) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:440 #4 0x000055d6950b1acd in fil_space_t::prepare_acquired (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/include/os0file.h:101 #5 0x000055d6950b1e68 in fil_space_t::acquire_and_prepare (this=this@entry=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:711 #6 0x000055d6950b2242 in fil_space_t::acquire (this=0x6120000448c0) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:722 #7 fil_space_extend (space=space@entry=0x6120000448c0, size=size@entry=7) at /data/Server/10.6_with_patch/storage/innobase/fil/fil0fil.cc:724 #8 0x000055d6950edda2 in fsp_try_extend_data_file_with_pages (space=space@entry=0x6120000448c0, page_no=page_no@entry=6, header=header@entry=0x13de2229fdc0, mtr=mtr@entry=0x48ab569eed90) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:606 #9 0x000055d6950ee372 in fsp_reserve_free_pages (space=space@entry=0x6120000448c0, header=header@entry=0x13de2229fdc0, size=size@entry=5, mtr=mtr@entry=0x48ab569eed90, n_pages=n_pages@entry=2) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:2337 #10 0x000055d6950ef0fd in fsp_reserve_free_extents (n_reserved=n_reserved@entry=0x48ab569eed70, space=0x6120000448c0, n_ext=n_ext@entry=1, alloc_type=alloc_type@entry=FSP_NORMAL, mtr=mtr@entry=0x48ab569eed90, n_pages=n_pages@entry=2) at /data/Server/10.6_with_patch/storage/innobase/fsp/fsp0fsp.cc:2414 That is when Thread 45 chooses to close the file handle that Thread 2 is creating: 10.6 270eeeb52358db8b76745dfb5fc0e78118a6a2bf 1999 mysql_mutex_lock(&fil_system.mutex); 2000 if (fil_space_t* space = fil_space_t::create(space_id, flags, 2001 FIL_TYPE_TABLESPACE, 2002 crypt_data, mode, true)) { 2003 fil_node_t* node = space->add(path, file, size, false, true); 2004 mysql_mutex_unlock(&fil_system.mutex); 2005 IF_WIN(node->find_metadata(), node->find_metadata(file, true)); Clearly, Thread 2 is in the wrong here, assuming that file , which was attached to node->handle , is still open. As soon as fil_system.mutex is released, other threads may close the file. This would wrongly allow Thread 2 to share the file handle that Thread 45 created. If we assume that both ALTER TABLE statements complete successfully, ultimately both tables t1 and t7 will end up being corrupted. One would receive some but not all writes that were intended for t7 , and the other would receive writes intended for both t1 and t7 . As far as I can tell, this error was introduced in MDEV-17380 (MariaDB Server 10.4.4).

            I think that the patch for 10.6 and later (after MDEV-31080 cleaned up some of this) should be as simple as this:

            diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
            index 7bd813b0c0d..e08d8a25171 100644
            --- a/storage/innobase/fil/fil0fil.cc
            +++ b/storage/innobase/fil/fil0fil.cc
            @@ -2001,8 +2001,8 @@ fil_ibd_create(
             						     FIL_TYPE_TABLESPACE,
             						     crypt_data, mode, true)) {
             		fil_node_t* node = space->add(path, file, size, false, true);
            -		mysql_mutex_unlock(&fil_system.mutex);
             		IF_WIN(node->find_metadata(), node->find_metadata(file, true));
            +		mysql_mutex_unlock(&fil_system.mutex);
             		mtr.start();
             		mtr.set_named_space(space);
             		ut_a(fsp_header_init(space, size, &mtr) == DB_SUCCESS);
            

            It is worth noting that also the much more I/O intensive function fil_node_t::read_page0() is protected by fil_system.mutex.

            marko Marko Mäkelä added a comment - I think that the patch for 10.6 and later (after MDEV-31080 cleaned up some of this) should be as simple as this: diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc index 7bd813b0c0d..e08d8a25171 100644 --- a/storage/innobase/fil/fil0fil.cc +++ b/storage/innobase/fil/fil0fil.cc @@ -2001,8 +2001,8 @@ fil_ibd_create( FIL_TYPE_TABLESPACE, crypt_data, mode, true)) { fil_node_t* node = space->add(path, file, size, false, true); - mysql_mutex_unlock(&fil_system.mutex); IF_WIN(node->find_metadata(), node->find_metadata(file, true)); + mysql_mutex_unlock(&fil_system.mutex); mtr.start(); mtr.set_named_space(space); ut_a(fsp_header_init(space, size, &mtr) == DB_SUCCESS); It is worth noting that also the much more I/O intensive function fil_node_t::read_page0() is protected by fil_system.mutex .

            On a closer look, 10.4 is unaffected by this after all. The race condition was actually introduced in MDEV-23855 when fil_ibd_create() was changed so that it would pass the already created file handle to fil_space_t::add().

            marko Marko Mäkelä added a comment - On a closer look, 10.4 is unaffected by this after all. The race condition was actually introduced in MDEV-23855 when fil_ibd_create() was changed so that it would pass the already created file handle to fil_space_t::add() .

            People

              marko Marko Mäkelä
              mleich Matthias Leich
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.