Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-29962

SIGSEGV in ha_spider::lock_tables on BEGIN after table lock

Details

    Description

      Not a new regression. The crash is present in optimized builds only, debug builds seem to work fine.

      INSTALL PLUGIN Spider SONAME 'ha_spider.so';
      CREATE USER Spider@localhost IDENTIFIED BY '';
      CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET '../socket.sock',DATABASE 'test',user 'Spider',PASSWORD '');
      CREATE TABLE t (c INT) ENGINE=InnoDB;
      CREATE TABLE t1 (c INT) ENGINE=Spider COMMENT='WRAPPER "mysql",srv "srv",TABLE "t"';
      SELECT * FROM t1;
      LOCK TABLES t1 WRITE CONCURRENT,t1 AS t2 READ;
      BEGIN;
      

      Leads to:

      10.11.1 50c5743adc87e1cdec1431a02558f6540fe5a6d5 (Optimized)

      Core was generated by `/test/MD221022-mariadb-10.11.1-linux-x86_64-opt/bin/mysqld --no-defaults --core'.
      Program terminated with signal SIGSEGV, Segmentation fault.
      #0  0x0000152d3d82cd4e in ha_spider::lock_tables (
          this=this@entry=0x152ccc04bb50)
          at /test/10.11_opt/storage/spider/ha_spider.cc:12184
      12184	        if (
      [Current thread is 1 (Thread 0x152d3d8d6700 (LWP 354245))]
      (gdb) bt
      #0  0x0000152d3d82cd4e in ha_spider::lock_tables (this=this@entry=0x152ccc04bb50) at /test/10.11_opt/storage/spider/ha_spider.cc:12184
      #1  0x0000152d3d82d200 in ha_spider::external_lock (this=0x152ccc04bb50, thd=<optimized out>, lock_type=2) at /test/10.11_opt/storage/spider/ha_spider.cc:921
      #2  0x000055aa825b5d78 in handler::ha_external_lock (this=0x152ccc04bb50, thd=thd@entry=0x152ccc000c58, lock_type=lock_type@entry=2) at /test/10.11_opt/sql/handler.cc:7095
      #3  0x000055aa826cb55c in handler::ha_external_unlock (thd=0x152ccc000c58, this=<optimized out>) at /test/10.11_opt/sql/handler.h:3538
      #4  unlock_external (count=<optimized out>, table=0x152ccc0813f8, thd=0x152ccc000c58) at /test/10.11_opt/sql/lock.cc:744
      #5  mysql_unlock_tables (thd=0x152ccc000c58, sql_lock=0x152ccc0813b8, free_lock=<optimized out>) at /test/10.11_opt/sql/lock.cc:435
      #6  0x000055aa826cbad8 in mysql_unlock_tables (thd=thd@entry=0x152ccc000c58, sql_lock=<optimized out>) at /test/10.11_opt/sql/lock.cc:418
      #7  0x000055aa822f2209 in close_thread_tables (thd=thd@entry=0x152ccc000c58) at /test/10.11_opt/sql/sql_base.cc:953
      #8  0x000055aa822f307b in Locked_tables_list::unlock_locked_tables (thd=0x152ccc000c58, this=0x152ccc004af8) at /test/10.11_opt/sql/sql_base.cc:2571
      #9  Locked_tables_list::unlock_locked_tables (this=this@entry=0x152ccc004af8, thd=thd@entry=0x152ccc000c58) at /test/10.11_opt/sql/sql_base.cc:2542
      #10 0x000055aa8248f28d in trans_begin (thd=thd@entry=0x152ccc000c58, flags=0) at /test/10.11_opt/sql/transaction.cc:115
      #11 0x000055aa8236a1ae in mysql_execute_command (thd=0x152ccc000c58, is_called_from_prepared_stmt=<optimized out>) at /test/10.11_opt/sql/sql_parse.cc:5605
      #12 0x000055aa82358335 in mysql_parse (rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>, thd=0x152ccc000c58) at /test/10.11_opt/sql/sql_parse.cc:8023
      #13 mysql_parse (thd=0x152ccc000c58, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /test/10.11_opt/sql/sql_parse.cc:7945
      #14 0x000055aa823640ea in dispatch_command (command=COM_QUERY, thd=0x152ccc000c58, packet=<optimized out>, packet_length=<optimized out>, blocking=<optimized out>) at /test/10.11_opt/sql/sql_class.h:1346
      #15 0x000055aa82365ee2 in do_command (thd=0x152ccc000c58, blocking=blocking@entry=true) at /test/10.11_opt/sql/sql_parse.cc:1407
      #16 0x000055aa8247ffbf in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55aa8599ac18, put_in_cache=put_in_cache@entry=true) at /test/10.11_opt/sql/sql_connect.cc:1416
      #17 0x000055aa8248029d in handle_one_connection (arg=0x55aa8599ac18) at /test/10.11_opt/sql/sql_connect.cc:1318
      #18 0x0000152d63b56609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #19 0x0000152d63742133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      

      Bug confirmed present in:
      MariaDB: 10.5.18 (opt), 10.6.10 (opt), 10.7.6 (opt), 10.8.5 (opt), 10.9.3 (opt), 10.10.2 (opt), 10.11.1 (opt)

      Bug (or feature/syntax) confirmed not present in:
      MariaDB: 10.3.37 (dbg), 10.3.37 (opt), 10.4.27 (dbg), 10.4.27 (opt), 10.5.18 (dbg), 10.6.10 (dbg), 10.7.6 (dbg), 10.8.5 (dbg), 10.9.3 (dbg), 10.10.2 (dbg), 10.11.1 (dbg)
      MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.38 (dbg), 5.7.38 (opt), 8.0.29 (dbg), 8.0.29 (opt)

      Attachments

        Issue Links

          Activity

            Roel Roel Van de Paar added a comment - - edited

            I made a matching MTR test as well:

            --source include/have_innodb.inc
            --let $SOCKET= `SELECT @@global.socket`
            INSTALL PLUGIN Spider SONAME 'ha_spider.so';
            CREATE USER Spider@localhost IDENTIFIED BY '';
            eval CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET "$SOCKET",DATABASE 'test',user 'Spider',PASSWORD '');
            CREATE TABLE t (c INT) ENGINE=InnoDB;
            CREATE TABLE t1 (c INT) ENGINE=Spider COMMENT='WRAPPER "mysql",srv "srv",TABLE "t"';
            SELECT * FROM t1;
            LOCK TABLES t1 WRITE CONCURRENT,t1 AS t2 READ;
            BEGIN;
            

            Which crashes in the same way. When ENGINE=InnoDB clause is removed (i.e. MyISAM default for MTR), it produces another outcome:

            10.11.1 50c5743adc87e1cdec1431a02558f6540fe5a6d5 (Optimized)

            CURRENT_TEST: main.test
            mysqltest: At line 10: query 'BEGIN' failed: ER_GET_ERRMSG (1296): Got error 3 'unknown' from SPIDER
            

            Roel Roel Van de Paar added a comment - - edited I made a matching MTR test as well: --source include/have_innodb.inc --let $SOCKET= `SELECT @@global.socket` INSTALL PLUGIN Spider SONAME 'ha_spider.so' ; CREATE USER Spider@localhost IDENTIFIED BY '' ; eval CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET "$SOCKET" , DATABASE 'test' , user 'Spider' , PASSWORD '' ); CREATE TABLE t (c INT ) ENGINE=InnoDB; CREATE TABLE t1 (c INT ) ENGINE=Spider COMMENT= 'WRAPPER "mysql",srv "srv",TABLE "t"' ; SELECT * FROM t1; LOCK TABLES t1 WRITE CONCURRENT,t1 AS t2 READ ; BEGIN ; Which crashes in the same way. When ENGINE=InnoDB clause is removed (i.e. MyISAM default for MTR), it produces another outcome: 10.11.1 50c5743adc87e1cdec1431a02558f6540fe5a6d5 (Optimized) CURRENT_TEST: main.test mysqltest: At line 10: query 'BEGIN' failed: ER_GET_ERRMSG (1296): Got error 3 'unknown' from SPIDER
            Roel Roel Van de Paar added a comment - - edited

            When trying the alternative MTR-like (i.e. using MYISAM) testcase at the CLI:

            INSTALL PLUGIN Spider SONAME 'ha_spider.so';
            CREATE USER Spider@localhost IDENTIFIED BY '';
            CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET '../socket.sock',DATABASE 'test',user 'Spider',PASSWORD '');
            CREATE TABLE t (c INT) ENGINE=MyISAM;
            CREATE TABLE t1 (c INT) ENGINE=Spider COMMENT='WRAPPER "mysql",srv "srv",TABLE "t"';
            SELECT * FROM t1;
            LOCK TABLES t1 WRITE CONCURRENT,t1 AS t2 READ;
            BEGIN;
            

            We get the same crash, so it is interesting that MTR fails in a different way when using MyISAM.

            Roel Roel Van de Paar added a comment - - edited When trying the alternative MTR-like (i.e. using MYISAM) testcase at the CLI: INSTALL PLUGIN Spider SONAME 'ha_spider.so' ; CREATE USER Spider@localhost IDENTIFIED BY '' ; CREATE SERVER srv FOREIGN DATA WRAPPER MYSQL OPTIONS (SOCKET '../socket.sock' , DATABASE 'test' , user 'Spider' , PASSWORD '' ); CREATE TABLE t (c INT ) ENGINE=MyISAM; CREATE TABLE t1 (c INT ) ENGINE=Spider COMMENT= 'WRAPPER "mysql",srv "srv",TABLE "t"' ; SELECT * FROM t1; LOCK TABLES t1 WRITE CONCURRENT,t1 AS t2 READ ; BEGIN ; We get the same crash, so it is interesting that MTR fails in a different way when using MyISAM .
            ycp Yuchen Pei added a comment -

            Hi holyfoot, ptal thanks

            commit a5e1be90a1b22569cf1ec33f0cee57d481385e3a (bb-10.10-mdev-29962)
            Author: Yuchen Pei <ycp@mariadb.com>
            Date:   Wed Oct 11 18:04:40 2023 +1100
             
                MDEV-29962 Spider: check trx and get conn before lock_tables()
                
                Same cause as MDEV-31996. This prevents reuse of conns freed
                previously, e.g. commit of previous statement.
            

            The above is based on 11.0. Also tested on

            • 10.5 867ff4dc29ac19f0691780f674315ff7098ec87a
            • 10.10 a5e1be90a1b22569cf1ec33f0cee57d481385e3a
            • enterprise-23.08 1d01b86ee157ec1e0b11c5f11aa63bb39321a227
            ycp Yuchen Pei added a comment - Hi holyfoot , ptal thanks commit a5e1be90a1b22569cf1ec33f0cee57d481385e3a (bb-10.10-mdev-29962) Author: Yuchen Pei <ycp@mariadb.com> Date: Wed Oct 11 18:04:40 2023 +1100   MDEV-29962 Spider: check trx and get conn before lock_tables() Same cause as MDEV-31996. This prevents reuse of conns freed previously, e.g. commit of previous statement. The above is based on 11.0. Also tested on 10.5 867ff4dc29ac19f0691780f674315ff7098ec87a 10.10 a5e1be90a1b22569cf1ec33f0cee57d481385e3a enterprise-23.08 1d01b86ee157ec1e0b11c5f11aa63bb39321a227
            ycp Yuchen Pei added a comment - - edited

            Sorry, I need to re-work this, because the problem in MDEV-27902
            also appears in 10.4, and a consideration of choosing between
            dml_init() and check_trx_and_get_conn() needs to be made. Will need
            to check MDEV-31996 and MDEV-28683 for this too.

            Here's the case for 10.4, it is the same as the original case and
            two other cases in MDEV-27902, but it fails at
            ha_spider::external_lock() for the same reason, and a fix
            (cd86deb2e98e133ce5d287e92ca048e04b4726b9) like the one in the
            previous comment which adds check_trx_and_get_conn() in
            ha_spider::external_lock() fixes it.

            # original case
            CREATE TABLE t (c INT) ENGINE=Spider;
            HANDLER t OPEN;
            --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE
            HANDLER t READ next;
            --error ER_PARSE_ERROR
            dummy;
            --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE
            HANDLER t READ next;
             
            drop table t;
             
            # case by nayuta
            CREATE TABLE t (c INT) ENGINE=Spider;
            HANDLER t OPEN;
            --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE
            HANDLER t READ FIRST;
            --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE
            HANDLER t READ NEXT;
             
            drop table t;
             
            # Another case by Roel
            CREATE TABLE t (c INT) ENGINE=Spider;
            HANDLER t OPEN;
            --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE
            HANDLER t READ NEXT;
            --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE
            SELECT * FROM t;
            --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE
            HANDLER t READ NEXT;
             
            drop table t;

            We also need to check why the heap-use-after-free happens at
            different time between 10.4 and higher versions.

            We will use this ticket to fix all heap-use-after-free of
            connections, including MDEV-27902 and possibly others.

            ycp Yuchen Pei added a comment - - edited Sorry, I need to re-work this, because the problem in MDEV-27902 also appears in 10.4, and a consideration of choosing between dml_init() and check_trx_and_get_conn() needs to be made. Will need to check MDEV-31996 and MDEV-28683 for this too. Here's the case for 10.4, it is the same as the original case and two other cases in MDEV-27902 , but it fails at ha_spider::external_lock() for the same reason, and a fix (cd86deb2e98e133ce5d287e92ca048e04b4726b9) like the one in the previous comment which adds check_trx_and_get_conn() in ha_spider::external_lock() fixes it. # original case CREATE TABLE t (c INT ) ENGINE=Spider; HANDLER t OPEN ; --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE HANDLER t READ next ; --error ER_PARSE_ERROR dummy; --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE HANDLER t READ next ;   drop table t;   # case by nayuta CREATE TABLE t (c INT ) ENGINE=Spider; HANDLER t OPEN ; --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE HANDLER t READ FIRST ; --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE HANDLER t READ NEXT ;   drop table t;   # Another case by Roel CREATE TABLE t (c INT ) ENGINE=Spider; HANDLER t OPEN ; --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE HANDLER t READ NEXT ; --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE SELECT * FROM t; --error ER_CONNECT_TO_FOREIGN_DATA_SOURCE HANDLER t READ NEXT ;   drop table t; We also need to check why the heap-use-after-free happens at different time between 10.4 and higher versions. We will use this ticket to fix all heap-use-after-free of connections, including MDEV-27902 and possibly others.
            ycp Yuchen Pei added a comment - - edited

            I think we should just take the simple approach, like in MDEV-28683,
            and call spider_check_trx_and_get_conn() when needed.

            I'll take a look at the problem similar to MDEV-26540 when running the
            MDEV-27902 case in 10.4 (see also comment[1] there), to rule out any
            issues that can be relevant to this one.

            [1] https://jira.mariadb.org/browse/MDEV-26540?focusedCommentId=271610&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-271610

            ycp Yuchen Pei added a comment - - edited I think we should just take the simple approach, like in MDEV-28683 , and call spider_check_trx_and_get_conn() when needed. I'll take a look at the problem similar to MDEV-26540 when running the MDEV-27902 case in 10.4 (see also comment [1] there), to rule out any issues that can be relevant to this one. [1] https://jira.mariadb.org/browse/MDEV-26540?focusedCommentId=271610&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-271610
            ycp Yuchen Pei added a comment -

            Hi holyfoot, ptal thanks

            upstream/bb-11.0-mdev-29962 617abe5303ffae3270f85bd1dee83bdfc5f63ee1
            MDEV-29962 Spider: creates connections if needed before lock_tables()
             
            Same cause as MDEV-31996. This prevents reuse of conns freed
            previously, e.g. during the commit of previous statement.
             
            It does not occur in 10.4 because of the commit for MDEV-19002 removed
            the call to spider_check_trx_and_get_conn().
            

            for a 10.5 patch, see

            d1110a81484 upstream/bb-10.5-mdev-29962 MDEV-29962 Spider: creates connections if needed before lock_tables()

            ycp Yuchen Pei added a comment - Hi holyfoot , ptal thanks upstream/bb-11.0-mdev-29962 617abe5303ffae3270f85bd1dee83bdfc5f63ee1 MDEV-29962 Spider: creates connections if needed before lock_tables()   Same cause as MDEV-31996. This prevents reuse of conns freed previously, e.g. during the commit of previous statement.   It does not occur in 10.4 because of the commit for MDEV-19002 removed the call to spider_check_trx_and_get_conn(). for a 10.5 patch, see d1110a81484 upstream/bb-10.5-mdev-29962 MDEV-29962 Spider: creates connections if needed before lock_tables()

            please check if the spider_check_trx_and_get_conn() can only be called
            when (lock_type != F_UNLCK).

            holyfoot Alexey Botchkov added a comment - please check if the spider_check_trx_and_get_conn() can only be called when (lock_type != F_UNLCK).
            ycp Yuchen Pei added a comment - - edited

            holyfoot, if I do that (i.e. apply the following diff on top, see commit 1b8db551a7d5dba054e80e74682c47899ac1b2af) I get heap-use-after-free when testing spider/bugfix.mdev_29962, so I think the function should be called regardless.

            modified   storage/spider/ha_spider.cc
            @@ -897,8 +897,9 @@ int ha_spider::external_lock(
                 }
               }
             
            -  if ((error_num= spider_check_trx_and_get_conn(thd, this)))
            -    DBUG_RETURN(error_num);
            +  if (lock_type != F_UNLCK)
            +    if ((error_num= spider_check_trx_and_get_conn(thd, this)))
            +      DBUG_RETURN(error_num);
               if (!partition_handler || !partition_handler->handlers)
               {
                 DBUG_RETURN(lock_tables()); /* Non-partitioned table */
            

            ycp Yuchen Pei added a comment - - edited holyfoot , if I do that (i.e. apply the following diff on top, see commit 1b8db551a7d5dba054e80e74682c47899ac1b2af) I get heap-use-after-free when testing spider/bugfix.mdev_29962, so I think the function should be called regardless. modified storage/spider/ha_spider.cc @@ -897,8 +897,9 @@ int ha_spider::external_lock( } } - if ((error_num= spider_check_trx_and_get_conn(thd, this))) - DBUG_RETURN(error_num); + if (lock_type != F_UNLCK) + if ((error_num= spider_check_trx_and_get_conn(thd, this))) + DBUG_RETURN(error_num); if (!partition_handler || !partition_handler->handlers) { DBUG_RETURN(lock_tables()); /* Non-partitioned table */
            ycp Yuchen Pei added a comment - - edited

            I've checked this issue with its related issue MDEV-27902 etc. and the patch seems still alright.

            Hi holyfoot, ptal thanks (basically the same patch)

            15ef5566135 upstream/bb-10.5-mdev-29962 MDEV-29962 Spider: creates connections if needed before lock_tables()
            

            ycp Yuchen Pei added a comment - - edited I've checked this issue with its related issue MDEV-27902 etc. and the patch seems still alright. Hi holyfoot , ptal thanks (basically the same patch) 15ef5566135 upstream/bb-10.5-mdev-29962 MDEV-29962 Spider: creates connections if needed before lock_tables()

            See MDEV-34590 for an ASAN heap-use-after-free in ha_spider::lock_tables, which may be caused by this bug, as this bug exists in 10.5, unlike the bug described in MDEV-34590 which is 10.6+ only.

            Roel Roel Van de Paar added a comment - See MDEV-34590 for an ASAN heap-use-after-free in ha_spider::lock_tables, which may be caused by this bug, as this bug exists in 10.5, unlike the bug described in MDEV-34590 which is 10.6+ only.

            ok to push.

            holyfoot Alexey Botchkov added a comment - ok to push.
            ycp Yuchen Pei added a comment -

            Thanks for the review - pushed 0bb9862888567a5304aec18e85bf6dd002fc25d3 to 10.5

            ycp Yuchen Pei added a comment - Thanks for the review - pushed 0bb9862888567a5304aec18e85bf6dd002fc25d3 to 10.5

            People

              ycp Yuchen Pei
              Roel Roel Van de Paar
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.