Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26692

Assertion failures upon altering table used as backing for oqgraph

Details

    Description

      I'm not sure whose fault is this – InnoDB or OQGraph or server – but given that the failures only happen with InnoDB table as backing, not MyISAM, I'll start from InnoDB as a suspect

      Test case 1

      --source include/have_innodb.inc
       
      INSTALL SONAME 'ha_oqgraph';
       
      CREATE OR REPLACE TABLE oq_backing (
         origid INT UNSIGNED NOT NULL, 
         destid INT UNSIGNED NOT NULL,  
         PRIMARY KEY (origid, destid), 
         KEY (destid)
       ) ENGINE=InnoDB;
       
      INSERT INTO oq_backing(origid, destid) 
        VALUES (1,2), (2,3), (3,4), (4,5), (2,6), (5,6);
       
      CREATE OR REPLACE TABLE oq (
         latch VARCHAR(32) NULL,
         origid BIGINT UNSIGNED NULL,
         destid BIGINT UNSIGNED NULL,
         weight DOUBLE NULL,
         seq BIGINT UNSIGNED NULL,
         linkid BIGINT UNSIGNED NULL,
         KEY (latch, origid, destid) USING HASH,
         KEY (latch, destid, origid) USING HASH
       ) 
       ENGINE=OQGRAPH 
       data_table='oq_backing' origid='origid' destid='destid';
       
      SELECT * FROM oq;
       
      alter table oq_backing force;
       
      # Cleanup
      DROP TABLE oq, oq_backing;
      UNINSTALL SONAME 'ha_oqgraph';
      

      10.2 f59f5c4a

      mysqld: /data/src/10.2/storage/innobase/handler/handler0alter.cc:7660: bool commit_try_rebuild(Alter_inplace_info*, ha_innobase_inplace_ctx*, TABLE*, const TABLE*, trx_t*, const char*): Assertion `user_table->get_ref_count() == 1' failed.
      210927  0:45:04 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f3663c7df36 in __GI___assert_fail (assertion=0x5582542eb0e0 "user_table->get_ref_count() == 1", file=0x5582542e89a8 "/data/src/10.2/storage/innobase/handler/handler0alter.cc", line=7660, function=0x5582542ec960 "bool commit_try_rebuild(Alter_inplace_info*, ha_innobase_inplace_ctx*, TABLE*, const TABLE*, trx_t*, const char*)") at assert.c:101
      #8  0x0000558253c1c140 in commit_try_rebuild (ha_alter_info=0x7f365df95ff0, ctx=0x7f3608013e00, altered_table=0x7f360809c480, old_table=0x7f360809df00, trx=0x7f365eb69228, table_name=0x7f36080abb65 "oq_backing") at /data/src/10.2/storage/innobase/handler/handler0alter.cc:7660
      #9  0x0000558253c186ba in ha_innobase::commit_inplace_alter_table (this=0x7f3608039c48, altered_table=0x7f360809c480, ha_alter_info=0x7f365df95ff0, commit=true) at /data/src/10.2/storage/innobase/handler/handler0alter.cc:8508
      #10 0x00005582539d4f16 in handler::ha_commit_inplace_alter_table (this=0x7f3608039c48, altered_table=0x7f360809c480, ha_alter_info=0x7f365df95ff0, commit=true) at /data/src/10.2/sql/handler.cc:4386
      #11 0x000055825381b151 in mysql_inplace_alter_table (thd=0x7f3608000d90, table_list=0x7f36080127f8, table=0x7f360809df00, altered_table=0x7f360809c480, ha_alter_info=0x7f365df95ff0, inplace_supported=HA_ALTER_INPLACE_NO_LOCK_AFTER_PREPARE, target_mdl_request=0x7f365df96080, alter_ctx=0x7f365df96630) at /data/src/10.2/sql/sql_table.cc:7540
      #12 0x0000558253820ef9 in mysql_alter_table (thd=0x7f3608000d90, new_db=0x7f3608012e18 "test", new_name=0x0, create_info=0x7f365df97250, table_list=0x7f36080127f8, alter_info=0x7f365df971a0, order_num=0, order=0x0, ignore=false) at /data/src/10.2/sql/sql_table.cc:9688
      #13 0x000055825389fc9c in Sql_cmd_alter_table::execute (this=0x7f3608012e20, thd=0x7f3608000d90) at /data/src/10.2/sql/sql_alter.cc:333
      #14 0x000055825374cd0c in mysql_execute_command (thd=0x7f3608000d90) at /data/src/10.2/sql/sql_parse.cc:6017
      #15 0x0000558253751b72 in mysql_parse (thd=0x7f3608000d90, rawbuf=0x7f3608012708 "alter table oq_backing force", length=28, parser_state=0x7f365df98560, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:7793
      #16 0x000055825373fdcd in dispatch_command (command=COM_QUERY, thd=0x7f3608000d90, packet=0x7f3608008b61 "alter table oq_backing force", packet_length=28, is_com_multi=false, is_next_command=false) at /data/src/10.2/sql/sql_parse.cc:1827
      #17 0x000055825373e8c8 in do_command (thd=0x7f3608000d90) at /data/src/10.2/sql/sql_parse.cc:1381
      #18 0x000055825389a691 in do_handle_one_connection (connect=0x5582574897b0) at /data/src/10.2/sql/sql_connect.cc:1336
      #19 0x000055825389a3f6 in handle_one_connection (arg=0x5582574897b0) at /data/src/10.2/sql/sql_connect.cc:1241
      #20 0x00005582540c4c2a in pfs_spawn_thread (arg=0x55825752ca00) at /data/src/10.2/storage/perfschema/pfs.cc:1869
      #21 0x00007f366418e609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #22 0x00007f3663d69293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      The second test case differs from the first one only in the ALTER part.

      Test case 2

      --source include/have_innodb.inc
       
      INSTALL SONAME 'ha_oqgraph';
       
      CREATE OR REPLACE TABLE oq_backing (
         origid INT UNSIGNED NOT NULL, 
         destid INT UNSIGNED NOT NULL,  
         PRIMARY KEY (origid, destid), 
         KEY (destid)
       ) ENGINE=InnoDB;
       
      INSERT INTO oq_backing(origid, destid) 
        VALUES (1,2), (2,3), (3,4), (4,5), (2,6), (5,6);
       
      CREATE OR REPLACE TABLE oq (
         latch VARCHAR(32) NULL,
         origid BIGINT UNSIGNED NULL,
         destid BIGINT UNSIGNED NULL,
         weight DOUBLE NULL,
         seq BIGINT UNSIGNED NULL,
         linkid BIGINT UNSIGNED NULL,
         KEY (latch, origid, destid) USING HASH,
         KEY (latch, destid, origid) USING HASH
       ) 
       ENGINE=OQGRAPH 
       data_table='oq_backing' origid='origid' destid='destid';
       
      SELECT * FROM oq;
       
      alter table oq_backing modify origid TINYINT;
      alter table oq_backing modify origid INT UNSIGNED NOT NULL;
       
      # Cleanup
      DROP TABLE oq, oq_backing;
      UNINSTALL SONAME 'ha_oqgraph';
      

      10.2 f59f5c4a

      At line 31: query 'alter table oq_backing modify origid INT UNSIGNED NOT NULL' failed: 1050: Table './test/#sql2-3ac447-9' already exists
      

      mysqld: /data/src/10.3/storage/innobase/row/row0mysql.cc:4350: dberr_t row_rename_table_for_mysql(const char*, const char*, trx_t*, bool, bool): Assertion `err != DB_DUPLICATE_KEY' failed.
      210927  0:47:10 [ERROR] mysqld got signal 6 ;
       
      #7  0x00007f94f7db7f36 in __GI___assert_fail (assertion=0x559ec2c4f735 "err != DB_DUPLICATE_KEY", file=0x559ec2c4c9b8 "/data/src/10.3/storage/innobase/row/row0mysql.cc", line=4350, function=0x559ec2c4f5b8 "dberr_t row_rename_table_for_mysql(const char*, const char*, trx_t*, bool, bool)") at assert.c:101
      #8  0x0000559ec24ac332 in row_rename_table_for_mysql (old_name=0x7f94f0e23810 "test/oq_backing", new_name=0x7f94f0e23610 "test/#sql2-3ac4a3-9", trx=0x7f94f1927178, commit=true, use_fk=true) at /data/src/10.3/storage/innobase/row/row0mysql.cc:4350
      #9  0x0000559ec2329ba6 in innobase_rename_table (trx=0x7f94f1927178, from=0x7f94f0e247c0 "./test/oq_backing", to=0x7f94f0e249d0 "./test/#sql2-3ac4a3-9", commit=true) at /data/src/10.3/storage/innobase/handler/ha_innodb.cc:13454
      #10 0x0000559ec2311566 in ha_innobase::rename_table (this=0x7f949c0145f8, from=0x7f94f0e247c0 "./test/oq_backing", to=0x7f94f0e249d0 "./test/#sql2-3ac4a3-9") at /data/src/10.3/storage/innobase/handler/ha_innodb.cc:13649
      #11 0x0000559ec20d0ce5 in handler::ha_rename_table (this=0x7f949c0145f8, from=0x7f94f0e247c0 "./test/oq_backing", to=0x7f94f0e249d0 "./test/#sql2-3ac4a3-9") at /data/src/10.3/sql/handler.cc:4696
      #12 0x0000559ec1e7a6d5 in mysql_rename_table (base=0x559ec4ed8db0, old_db=0x7f94f0e258b0, old_name=0x7f94f0e258c0, new_db=0x7f94f0e258b0, new_name=0x7f94f0e25170, flags=2) at /data/src/10.3/sql/sql_table.cc:5611
      #13 0x0000559ec1e879b4 in mysql_alter_table (thd=0x7f949c000d90, new_db=0x7f949c005470, new_name=0x7f949c005858, create_info=0x7f94f0e26490, table_list=0x7f949c012c08, alter_info=0x7f94f0e263d0, order_num=0, order=0x0, ignore=false) at /data/src/10.3/sql/sql_table.cc:10266
      #14 0x0000559ec1f18230 in Sql_cmd_alter_table::execute (this=0x7f949c013388, thd=0x7f949c000d90) at /data/src/10.3/sql/sql_alter.cc:512
      #15 0x0000559ec1da1033 in mysql_execute_command (thd=0x7f949c000d90) at /data/src/10.3/sql/sql_parse.cc:6075
      #16 0x0000559ec1da66f2 in mysql_parse (thd=0x7f949c000d90, rawbuf=0x7f949c012ad8 "alter table oq_backing modify origid INT UNSIGNED NOT NULL", length=58, parser_state=0x7f94f0e27530, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:7870
      #17 0x0000559ec1d92db3 in dispatch_command (command=COM_QUERY, thd=0x7f949c000d90, packet=0x7f949c008f31 "", packet_length=58, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:1852
      #18 0x0000559ec1d91753 in do_command (thd=0x7f949c000d90) at /data/src/10.3/sql/sql_parse.cc:1398
      #19 0x0000559ec1f120c6 in do_handle_one_connection (connect=0x559ec51e5420) at /data/src/10.3/sql/sql_connect.cc:1403
      #20 0x0000559ec1f11e22 in handle_one_connection (arg=0x559ec51e5420) at /data/src/10.3/sql/sql_connect.cc:1308
      #21 0x0000559ec28e576d in pfs_spawn_thread (arg=0x559ec51c8340) at /data/src/10.3/storage/perfschema/pfs.cc:1869
      #22 0x00007f94f7f7c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #23 0x00007f94f7ea3293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      Attachments

        Activity

          marko Marko Mäkelä added a comment - - edited

          For the first failure, I performed some analysis with ./mtr --rr:

          10.6 76972163711f965402d51055f081ab51ae4a3bb7

          (rr) reverse-continue
          Continuing.
           
          Thread 2 hit Hardware watchpoint 1: -location table.n_ref_count
           
          Old value = {m_counter = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 2}, <No data fields>}}
          New value = {m_counter = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 1}, <No data fields>}}
          0x00005637db00b258 in std::__atomic_base<unsigned int>::fetch_add (__m=std::memory_order_relaxed, __i=1, this=0x7fc97c105140) at /usr/include/c++/9/bits/atomic_base.h:539
          539	      fetch_add(__int_type __i,
          (rr) bt
          #0  0x00005637db00b258 in std::__atomic_base<unsigned int>::fetch_add (__m=std::memory_order_relaxed, __i=1, this=0x7fc97c105140) at /usr/include/c++/9/bits/atomic_base.h:539
          #1  Atomic_counter<unsigned int>::add (i=1, this=0x7fc97c105140) at /home/marko/server/include/my_counter.h:27
          #2  Atomic_counter<unsigned int>::operator++ (this=0x7fc97c105140) at /home/marko/server/include/my_counter.h:36
          #3  dict_table_t::acquire (this=0x7fc97c104ef8) at /home/marko/server/storage/innobase/include/dict0dict.ic:1145
          #4  0x00005637db002c08 in dict_table_open_on_name (table_name=<optimized out>, table_name@entry=0x7fc98d9641f0 "test/oq_backing", dict_locked=dict_locked@entry=false, 
              ignore_err=ignore_err@entry=DICT_ERR_IGNORE_FK_NOKEY) at /home/marko/server/storage/innobase/dict/dict0dict.cc:1047
          #5  0x00005637dacaf5ac in ha_innobase::open_dict_table (norm_name=norm_name@entry=0x7fc98d9641f0 "test/oq_backing", is_partition=is_partition@entry=false, ignore_err=ignore_err@entry=DICT_ERR_IGNORE_FK_NOKEY)
              at /home/marko/server/storage/innobase/handler/ha_innodb.cc:6154
          #6  0x00005637dacc7e27 in ha_innobase::open (this=0x7fc97c0643c0, name=0x7fc97c05f830 "./test/oq_backing") at /home/marko/server/storage/innobase/handler/ha_innodb.cc:5868
          #7  0x00005637da8f2b89 in handler::ha_open (this=0x7fc97c0643c0, table_arg=table_arg@entry=0x7fc97c062db8, name=0x7fc97c05f830 "./test/oq_backing", mode=2, test_if_locked=test_if_locked@entry=18, 
              mem_root=mem_root@entry=0x0, partitions_to_open=0x0) at /home/marko/server/sql/handler.cc:3313
          #8  0x00005637da72d035 in open_table_from_share (thd=thd@entry=0x7fc97c001b08, share=share@entry=0x7fc97c062818, alias=alias@entry=0x7fc98d6198b0 <empty_lex_cstring>, db_stat=db_stat@entry=33, 
              prgflag=prgflag@entry=8, ha_open_flags=18, outparam=0x7fc97c062db8, is_create_table=false, partitions_to_open=0x0) at /home/marko/server/sql/table.cc:4327
          #9  0x00007fc98d5fbdc4 in ha_oqgraph::open (this=0x7fc97c062340, name=0x7fc97c05ca70 "./test/oq", mode=<optimized out>, test_if_locked=<optimized out>) at /home/marko/server/storage/oqgraph/ha_oqgraph.cc:627
          #10 0x00005637da8f2b89 in handler::ha_open (this=0x7fc97c062340, table_arg=table_arg@entry=0x7fc97c05e548, name=0x7fc97c05ca70 "./test/oq", mode=2, test_if_locked=test_if_locked@entry=18, 
              mem_root=mem_root@entry=0x0, partitions_to_open=0x0) at /home/marko/server/sql/handler.cc:3313
          #20 0x00005637da617f54 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7fc97c001b08, packet=packet@entry=0x7fc97c108ce9 "SELECT * FROM oq", packet_length=packet_length@entry=16, 
              blocking=blocking@entry=true) at /home/marko/server/sql/sql_class.h:1358
          

          We can see that the table handle on oq_backing was opened for accessing the table og. This looks like a locking bug to me. OQGraph fails to acquire MDL_SHARED on the table oq_backing whenever the table oq is being accessed. Such MDL_SHARED would correctly block any DDL operation on oq_backing.

          Before MDEV-25506 in 10.6, this situation might have been worked around by the "background DROP TABLE queue", that is, an old copy of the table oq_backing that was discarded in a table-rebuilding DDL operation could be silently renamed to start with an #sql-ib name and eventually dropped when the reference count reaches 0. If you had specified not only ALTER TABLE oq_backing FORCE but ,ALGORITHM=COPY then the crash might have been avoided in MariaDB Server versions earlier than 10.6.

          marko Marko Mäkelä added a comment - - edited For the first failure, I performed some analysis with ./mtr --rr : 10.6 76972163711f965402d51055f081ab51ae4a3bb7 (rr) reverse-continue Continuing.   Thread 2 hit Hardware watchpoint 1: -location table.n_ref_count   Old value = {m_counter = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 2}, <No data fields>}} New value = {m_counter = {<std::__atomic_base<unsigned int>> = {static _S_alignment = 4, _M_i = 1}, <No data fields>}} 0x00005637db00b258 in std::__atomic_base<unsigned int>::fetch_add (__m=std::memory_order_relaxed, __i=1, this=0x7fc97c105140) at /usr/include/c++/9/bits/atomic_base.h:539 539 fetch_add(__int_type __i, (rr) bt #0 0x00005637db00b258 in std::__atomic_base<unsigned int>::fetch_add (__m=std::memory_order_relaxed, __i=1, this=0x7fc97c105140) at /usr/include/c++/9/bits/atomic_base.h:539 #1 Atomic_counter<unsigned int>::add (i=1, this=0x7fc97c105140) at /home/marko/server/include/my_counter.h:27 #2 Atomic_counter<unsigned int>::operator++ (this=0x7fc97c105140) at /home/marko/server/include/my_counter.h:36 #3 dict_table_t::acquire (this=0x7fc97c104ef8) at /home/marko/server/storage/innobase/include/dict0dict.ic:1145 #4 0x00005637db002c08 in dict_table_open_on_name (table_name=<optimized out>, table_name@entry=0x7fc98d9641f0 "test/oq_backing", dict_locked=dict_locked@entry=false, ignore_err=ignore_err@entry=DICT_ERR_IGNORE_FK_NOKEY) at /home/marko/server/storage/innobase/dict/dict0dict.cc:1047 #5 0x00005637dacaf5ac in ha_innobase::open_dict_table (norm_name=norm_name@entry=0x7fc98d9641f0 "test/oq_backing", is_partition=is_partition@entry=false, ignore_err=ignore_err@entry=DICT_ERR_IGNORE_FK_NOKEY) at /home/marko/server/storage/innobase/handler/ha_innodb.cc:6154 #6 0x00005637dacc7e27 in ha_innobase::open (this=0x7fc97c0643c0, name=0x7fc97c05f830 "./test/oq_backing") at /home/marko/server/storage/innobase/handler/ha_innodb.cc:5868 #7 0x00005637da8f2b89 in handler::ha_open (this=0x7fc97c0643c0, table_arg=table_arg@entry=0x7fc97c062db8, name=0x7fc97c05f830 "./test/oq_backing", mode=2, test_if_locked=test_if_locked@entry=18, mem_root=mem_root@entry=0x0, partitions_to_open=0x0) at /home/marko/server/sql/handler.cc:3313 #8 0x00005637da72d035 in open_table_from_share (thd=thd@entry=0x7fc97c001b08, share=share@entry=0x7fc97c062818, alias=alias@entry=0x7fc98d6198b0 <empty_lex_cstring>, db_stat=db_stat@entry=33, prgflag=prgflag@entry=8, ha_open_flags=18, outparam=0x7fc97c062db8, is_create_table=false, partitions_to_open=0x0) at /home/marko/server/sql/table.cc:4327 #9 0x00007fc98d5fbdc4 in ha_oqgraph::open (this=0x7fc97c062340, name=0x7fc97c05ca70 "./test/oq", mode=<optimized out>, test_if_locked=<optimized out>) at /home/marko/server/storage/oqgraph/ha_oqgraph.cc:627 #10 0x00005637da8f2b89 in handler::ha_open (this=0x7fc97c062340, table_arg=table_arg@entry=0x7fc97c05e548, name=0x7fc97c05ca70 "./test/oq", mode=2, test_if_locked=test_if_locked@entry=18, mem_root=mem_root@entry=0x0, partitions_to_open=0x0) at /home/marko/server/sql/handler.cc:3313 … #20 0x00005637da617f54 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7fc97c001b08, packet=packet@entry=0x7fc97c108ce9 "SELECT * FROM oq", packet_length=packet_length@entry=16, blocking=blocking@entry=true) at /home/marko/server/sql/sql_class.h:1358 We can see that the table handle on oq_backing was opened for accessing the table og . This looks like a locking bug to me. OQGraph fails to acquire MDL_SHARED on the table oq_backing whenever the table oq is being accessed. Such MDL_SHARED would correctly block any DDL operation on oq_backing . Before MDEV-25506 in 10.6, this situation might have been worked around by the "background DROP TABLE queue", that is, an old copy of the table oq_backing that was discarded in a table-rebuilding DDL operation could be silently renamed to start with an #sql-ib name and eventually dropped when the reference count reaches 0. If you had specified not only ALTER TABLE oq_backing FORCE but ,ALGORITHM=COPY then the crash might have been avoided in MariaDB Server versions earlier than 10.6.

          People

            serg Sergei Golubchik
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.