Details

    Description

      Problem found during RQG testing on ASAN build.
      One connection runs a statement stream consisting of
      1. One
          SET SESSION sql_mode = 'NO_ENGINE_SUBSTITUTION';
          CREATE OR REPLACE TEMPORARY TABLE t17 ( tcol10 TIME); # Table will use InnoDB
      2. Some long stream mixed up of
           ~ 20000  INSERT INTO t17 ( vcol2 ) SELECT vcol2 FROM t17 ;           # all fail correct with error 1054 Unknown column ...
           ~ 20000  ALTER TABLE t17 CHANGE COLUMN vcol2 tcol10 TIME;  # all fail correct with error 1054
           ~ 2000     TRUNCATE TABLE t17;                                                                      # all pass
      3. around that number 40000 statements or more I get
      2019-03-06 17:03:43 139904579504000 [Note] /work/10.2/bld_asan/sql/mysqld: ready for connections.
      Version: '10.2.23-MariaDB-debug-log'  socket: '/dev/shm/vardir/1551888201/46/1/mysql.sock'  port: 20200  Source distribution
      =================================================================
      ==185780==ERROR: AddressSanitizer: heap-use-after-free on address 0x6170001ece00 at pc 0x55d4bff5e299 bp 0x7f3df26d7c70 sp 0x7f3df26d7c60
      READ of size 1 at 0x6170001ece00 thread T17
          #0 0x55d4bff5e298 in ut_fold_string storage/innobase/include/ut0rnd.ic:168
          #1 0x55d4bff67140 in dict_table_check_if_in_cache_low storage/innobase/include/dict0priv.ic:120
          #2 0x55d4bff6c44d in dict_table_open_on_name(char const*, unsigned long, unsigned long, dict_err_ignore_t) storage/innobase/dict/dict0dict.cc:1165
          #3 0x55d4bfca2e0b in row_drop_table_for_mysql(char const*, trx_t*, enum_sql_command, bool, bool) storage/innobase/row/row0mysql.cc:3300
          #4 0x55d4bfc9f454 in row_drop_table_for_mysql_in_background storage/innobase/row/row0mysql.cc:2527
          #5 0x55d4bfc9fa2d in row_drop_tables_for_mysql_in_background() storage/innobase/row/row0mysql.cc:2594
          #6 0x55d4bfd6ccaf in srv_master_do_active_tasks storage/innobase/srv/srv0srv.cc:2201
          #7 0x55d4bfd6e206 in srv_master_thread storage/innobase/srv/srv0srv.cc:2432
          #8 0x7f3e110156d9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76d9)
          #9 0x7f3e104a7d7e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x108d7e)
       
      0x6170001ece00 is located 0 bytes inside of 655-byte region [0x6170001ece00,0x6170001ed08f)
      freed by thread T32 here:
          #0 0x7f3e11b4cb60 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc6b60)
          #1 0x55d4bffadc36 in dict_mem_table_free(dict_table_t*) storage/innobase/dict/dict0mem.cc:204
          #2 0x55d4bff72830 in dict_table_remove_from_cache_low(dict_table_t*, unsigned long) storage/innobase/dict/dict0dict.cc:2104
          #3 0x55d4bff728c0 in dict_table_remove_from_cache(dict_table_t*) storage/innobase/dict/dict0dict.cc:2114
          #4 0x55d4bfca4476 in row_drop_table_for_mysql(char const*, trx_t*, enum_sql_command, bool, bool) storage/innobase/row/row0mysql.cc:3693
          #5 0x55d4bfa59c79 in ha_innobase::delete_table(char const*, enum_sql_command) (bld_asan/sql/mysqld+0x1907c79)
          #6 0x55d4bfa2d9e3 in ha_innobase::truncate() storage/innobase/handler/ha_innodb.cc:13588
          #7 0x55d4bf532ba7 in handler::ha_truncate() sql/handler.cc:4129
          #8 0x55d4bf9733b2 in Sql_cmd_truncate_table::handler_truncate(THD*, TABLE_LIST*, bool) sql/sql_truncate.cc:245
          #9 0x55d4bf9741f8 in Sql_cmd_truncate_table::truncate_table(THD*, TABLE_LIST*) sql/sql_truncate.cc:403
          #10 0x55d4bf974863 in Sql_cmd_truncate_table::execute(THD*) sql/sql_truncate.cc:499
          #11 0x55d4bef56337 in mysql_execute_command(THD*) sql/sql_parse.cc:6231
          #12 0x55d4bef60dd1 in mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) sql/sql_parse.cc:8018
          #13 0x55d4bef3b909 in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) sql/sql_parse.cc:1829
          #14 0x55d4bef388b9 in do_command(THD*) sql/sql_parse.cc:1378
          #15 0x55d4bf26e5af in do_handle_one_connection(CONNECT*) sql/sql_connect.cc:1336
          #16 0x55d4bf26df94 in handle_one_connection sql/sql_connect.cc:1242
          #17 0x7f3e110156d9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76d9)
       
      previously allocated by thread T32 here:
          #0 0x7f3e11b4d270 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc7270)
          #1 0x55d4bff6fcb7 in dict_table_rename_in_cache(dict_table_t*, char const*, bool, bool) storage/innobase/dict/dict0dict.cc:1730
          #2 0x55d4bfca7887 in row_rename_table_for_mysql(char const*, char const*, trx_t*, bool, bool) storage/innobase/row/row0mysql.cc:4519
          #3 0x55d4bfa5a7e0 in innobase_rename_table(trx_t*, char const*, char const*, bool, bool) (bld_asan/sql/mysqld+0x19087e0)
          #4 0x55d4bfa2d645 in ha_innobase::truncate() storage/innobase/handler/ha_innodb.cc:13550
          #5 0x55d4bf532ba7 in handler::ha_truncate() sql/handler.cc:4129
          #6 0x55d4bf9733b2 in Sql_cmd_truncate_table::handler_truncate(THD*, TABLE_LIST*, bool) sql/sql_truncate.cc:245
          #7 0x55d4bf9741f8 in Sql_cmd_truncate_table::truncate_table(THD*, TABLE_LIST*) sql/sql_truncate.cc:403
          #8 0x55d4bf974863 in Sql_cmd_truncate_table::execute(THD*) sql/sql_truncate.cc:499
          #9 0x55d4bef56337 in mysql_execute_command(THD*) sql/sql_parse.cc:6231
          #10 0x55d4bef60dd1 in mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) sql/sql_parse.cc:8018
          #11 0x55d4bef3b909 in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) sql/sql_parse.cc:1829
          #12 0x55d4bef388b9 in do_command(THD*) sql/sql_parse.cc:1378
          #13 0x55d4bf26e5af in do_handle_one_connection(CONNECT*) sql/sql_connect.cc:1336
          #14 0x55d4bf26df94 in handle_one_connection sql/sql_connect.cc:1242
          #15 0x7f3e110156d9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76d9)
      Thread T17 created by T0 here:
          #0 0x7f3e11ab74e8 in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x314e8)
          #1 0x55d4bfba081e in os_thread_create_func(void* (*)(void*), void*, unsigned long*) storage/innobase/os/os0thread.cc:132
          #2 0x55d4bfd800ec in innobase_start_or_create_for_mysql() storage/innobase/srv/srv0start.cc:2620
          #3 0x55d4bfa016e2 in innobase_init storage/innobase/handler/ha_innodb.cc:4377
          #4 0x55d4bf51d13f in ha_initialize_handlerton(st_plugin_int*) sql/handler.cc:521
          #5 0x55d4bef7be8b in plugin_initialize sql/sql_plugin.cc:1416
          #6 0x55d4bef7d5e1 in plugin_init(int*, char**, int) sql/sql_plugin.cc:1697
          #7 0x55d4bed535fa in init_server_components sql/mysqld.cc:5295
          #8 0x55d4bed55362 in mysqld_main(int, char**) sql/mysqld.cc:5891
          #9 0x55d4bed40b6f in main sql/main.cc:25
          #10 0x7f3e103bf3f0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x203f0)
       
      Thread T32 created by T0 here:
          #0 0x7f3e11ab74e8 in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x314e8)
          #1 0x55d4c05ebf22 in spawn_thread_noop mysys/psi_noop.c:187
          #2 0x55d4bed421e2 in inline_mysql_thread_create include/mysql/psi/mysql_thread.h:1239
          #3 0x55d4bed5670a in create_thread_to_handle_connection(CONNECT*) sql/mysqld.cc:6466
          #4 0x55d4bed56e01 in create_new_thread sql/mysqld.cc:6536
          #5 0x55d4bed57e28 in handle_connections_sockets() sql/mysqld.cc:6811
          #6 0x55d4bed55c57 in mysqld_main(int, char**) sql/mysqld.cc:6085
          #7 0x55d4bed40b6f in main sql/main.cc:25
          #8 0x7f3e103bf3f0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x203f0)
       
      SUMMARY: AddressSanitizer: heap-use-after-free storage/innobase/include/ut0rnd.ic:168 in ut_fold_string
      Shadow bytes around the buggy address:
        0x0c2e80035970: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2e80035980: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2e80035990: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2e800359a0: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        0x0c2e800359b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      =>0x0c2e800359c0:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2e800359d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2e800359e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2e800359f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2e80035a00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c2e80035a10: fd fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      Shadow byte legend (one shadow byte represents 8 application bytes):
        Addressable:           00
        Partially addressable: 01 02 03 04 05 06 07
        Heap left redzone:       fa
        Heap right redzone:      fb
        Freed heap region:       fd
        Stack left redzone:      f1
        Stack mid redzone:       f2
        Stack right redzone:     f3
        Stack partial redzone:   f4
        Stack after return:      f5
        Stack use after scope:   f8
        Global redzone:          f9
        Global init order:       f6
        Poisoned by user:        f7
        Container overflow:      fc
        Array cookie:            ac
        Intra object redzone:    bb
        ASan internal:           fe
        Left alloca redzone:     ca
        Right alloca redzone:    cb
      ==185780==ABORTING
      190306 17:04:11 [ERROR] mysqld got signal 6 ;
       
      It looks like the problem was introduced by
      10.2 ca76fc4a3a1c5f393e1e34005975582e73e84365 2019-02-19T11:14:03+02:00
      No replay with
      10.2 346e46089621e6951e076c82ed5690aa23dcb5fe 2019-02-19T10:51:34 2019 +0200
       
      Up till today the attempts to write some MTR based test for replay had no luck.
      Even single RQG tests do not replay all time.
      Several RQG tests in parallel replay quite good.
      
      

      Attachments

        1. MDEV-18836.log.tgz
          149 kB
        2. MDEV-18836.tgz
          11 kB
        3. simp_asan.yy
          1 kB

        Issue Links

          Activity

            I do not think that this regression can be due to the indicated commit; it must be something older in 10.2. The "working" commit ID is for 10.1, and it was merged by me to 10.2 straight after the indicated 10.2 commit. Coincidentally, this merge has both parent commit IDs that you mention: the 10.2 commit where the heap-use-after-free is present, and the 10.1 commit where you were not able to repeat the issue.

            marko Marko Mäkelä added a comment - I do not think that this regression can be due to the indicated commit; it must be something older in 10.2. The "working" commit ID is for 10.1, and it was merged by me to 10.2 straight after the indicated 10.2 commit . Coincidentally, this merge has both parent commit IDs that you mention: the 10.2 commit where the heap-use-after-free is present, and the 10.1 commit where you were not able to repeat the issue.
            mleich Matthias Leich added a comment - - edited

            Sorry for picking the wrong commit.
             
            10.2 commit 2027841d5bc7e22ebb24a5a7538d0fd057dd8352 2018-12-16
            , the server reports to be a 10.2.20-MariaDB-debug, seems to have introduced the problem.
             
            Replay on (ASAN builds)
            - 10.2 commit b4cda8bbbc23dbf8a7d59d06de390ea35695efe7 2019-03-07
            - mariadb-10.2.20 (git log says commit 975f4a1295f2b678c5ecea2cf4cfc69e177d88f5)
             
            No replay on (ASAN build)
            10.3.14 commit d30f17af4969cc1ce34f1925f5ea2bced9c6f7e9 2019-03-05
            10.4  commit 5f34513c2a2dd5f8431cf03f6ba083c42f233dca 2019-03-06
             
            Result of 300 RQG runs with simplified RQG grammar and some debug build without ASAN
               10.2 commit b4cda8bbbc23dbf8a7d59d06de390ea35695efe7 2019-03-07
            (maybe it shows whatever suspicious results): all tests pass
             
            So please feel free to fix in 10.2 or not.
            
            

            mleich Matthias Leich added a comment - - edited Sorry for picking the wrong commit.   10.2 commit 2027841d5bc7e22ebb24a5a7538d0fd057dd8352 2018-12-16 , the server reports to be a 10.2.20-MariaDB-debug, seems to have introduced the problem.   Replay on (ASAN builds) - 10.2 commit b4cda8bbbc23dbf8a7d59d06de390ea35695efe7 2019-03-07 - mariadb-10.2.20 (git log says commit 975f4a1295f2b678c5ecea2cf4cfc69e177d88f5)   No replay on (ASAN build) 10.3.14 commit d30f17af4969cc1ce34f1925f5ea2bced9c6f7e9 2019-03-05 10.4 commit 5f34513c2a2dd5f8431cf03f6ba083c42f233dca 2019-03-06   Result of 300 RQG runs with simplified RQG grammar and some debug build without ASAN 10.2 commit b4cda8bbbc23dbf8a7d59d06de390ea35695efe7 2019-03-07 (maybe it shows whatever suspicious results): all tests pass   So please feel free to fix in 10.2 or not.

            MDEV-18836.tgz contains some files for replaying the problem.
            You need my improved RQG+Simplifier.
            git clone https://github.com/mleich1/rqg --branch experimental RQG_mleich1
            cd RQG_mleich1

            mleich Matthias Leich added a comment - MDEV-18836 .tgz contains some files for replaying the problem. You need my improved RQG+Simplifier. git clone https://github.com/mleich1/rqg --branch experimental RQG_mleich1 cd RQG_mleich1

            There appears to be a race condition between ha_innobase::truncate() and row_drop_table_for_mysql_in_background() on TRUNCATE TABLE.

            This race condition does not happen to be easily repeatable in MariaDB Server 10.3 or 10.4, presumably because dict_mem_create_temporary_tablename() does not involve inter-thread synchronization. We can simplify that function for the case innodb_safe_truncate=ON in MariaDB Server 10.2, but we should also fix the race condition.

            marko Marko Mäkelä added a comment - There appears to be a race condition between ha_innobase::truncate() and row_drop_table_for_mysql_in_background() on TRUNCATE TABLE . This race condition does not happen to be easily repeatable in MariaDB Server 10.3 or 10.4, presumably because dict_mem_create_temporary_tablename() does not involve inter-thread synchronization. We can simplify that function for the case innodb_safe_truncate=ON in MariaDB Server 10.2, but we should also fix the race condition.

            I believe that the problem is that the table can be dropped (and the cached metadata freed) between the time row_drop_tables_for_mysql_in_background() closes the table handle and tries to drop the table by name by invoking row_drop_table_for_mysql_in_background(). We must copy the table name before closing the table handle. Normally, the table name should be inaccessible to users, because it should have been renamed to an #sql-ib name before the table was added to the background drop queue.

            marko Marko Mäkelä added a comment - I believe that the problem is that the table can be dropped (and the cached metadata freed) between the time row_drop_tables_for_mysql_in_background() closes the table handle and tries to drop the table by name by invoking row_drop_table_for_mysql_in_background() . We must copy the table name before closing the table handle. Normally, the table name should be inaccessible to users, because it should have been renamed to an #sql-ib name before the table was added to the background drop queue.

            People

              marko Marko Mäkelä
              mleich Matthias Leich
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.