MariaDB Server / MDEV-26578

ERROR: AddressSanitizer: heap-use-after-free around dict_table_t::is_temporary_name

Details

    Description

      origin/bb-10.5-MDEV-25776 8d8c640963ed79ffc65d509c0c6e8466a5966ea8 2021-09-08T15:38:28+03:00
      Per Marko:
      The fix for MDEV-25776 is slightly imperfect, but the bad effect observed here relates to FTS and cannot be caused by that fix.
      That is, the problem should be present in current 10.5 as well.
       
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 328169][rr 2853077 328173]==2853077==ERROR: AddressSanitizer: heap-use-after-free on address 0x6040003a3710 at pc 0x000067625235 bp 0x6400008bb420 sp 0x6400008babc8
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 328176][rr 2853077 328178]READ of size 33 at 0x6040003a3710 thread T5
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351469]    #0 0x67625234  (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x57234)
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351471]    #1 0x676a7c10 in strstr (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xd9c10)
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351597]    #2 0x559832212179 in dict_table_t::is_temporary_name(char const*) /data/Server/bb-10.5-MDEV-25776/storage/innobase/include/dict0mem.h:1864
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351644]    #3 0x5598322123b1 in table_name_t::is_temporary() const /data/Server/bb-10.5-MDEV-25776/storage/innobase/include/dict0mem.h:2374
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351664]    #4 0x5598325903a9 in row_purge_parse_undo_rec /data/Server/bb-10.5-MDEV-25776/storage/innobase/row/row0purge.cc:938
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351716]    #5 0x559832591a0b in row_purge /data/Server/bb-10.5-MDEV-25776/storage/innobase/row/row0purge.cc:1107
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351718]    #6 0x559832591dad in row_purge_step(que_thr_t*) /data/Server/bb-10.5-MDEV-25776/storage/innobase/row/row0purge.cc:1159
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351734]    #7 0x559832476792 in que_thr_step /data/Server/bb-10.5-MDEV-25776/storage/innobase/que/que0que.cc:946
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351758]    #8 0x559832476bfe in que_run_threads_low /data/Server/bb-10.5-MDEV-25776/storage/innobase/que/que0que.cc:1008
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351779]    #9 0x55983247705d in que_run_threads(que_thr_t*) /data/Server/bb-10.5-MDEV-25776/storage/innobase/que/que0que.cc:1048
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 351924]    #10 0x559832672cc5 in trx_purge(unsigned long, bool) /data/Server/bb-10.5-MDEV-25776/storage/innobase/trx/trx0purge.cc:1286
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 352011]    #11 0x559832614d8c in srv_do_purge /data/Server/bb-10.5-MDEV-25776/storage/innobase/srv/srv0srv.cc:1948
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 352064]    #12 0x5598326159d9 in purge_coordinator_callback_low /data/Server/bb-10.5-MDEV-25776/storage/innobase/srv/srv0srv.cc:2048
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 352092]    #13 0x559832615c3d in purge_coordinator_callback /data/Server/bb-10.5-MDEV-25776/storage/innobase/srv/srv0srv.cc:2077
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 352096]    #14 0x559832a186a7 in tpool::task_group::execute(tpool::task*) /data/Server/bb-10.5-MDEV-25776/tpool/task_group.cc:55
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 352098]    #15 0x559832a18fae in tpool::task::execute() /data/Server/bb-10.5-MDEV-25776/tpool/task.cc:47
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 352140]2021-09-08 12:27:00 41 [Note] InnoDB: Online DDL : Start reading clustered index of the table and create temporary files
      # 2021-09-08T12:28:56 [2852137] | [rr 2853077 352182]    #16 0x559832a04f6c in tpool::thread_pool_generic::worker_main(tpool::worker_data*) /data/Server/bb-10.5-MDEV-25776/tpool/tpool_generic.cc:546
      sdp:/data/Results/1631107041/TBR-1193/dev/shm/vardir/1631107041/12/1/rr
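       
      Frame #2 shows the purge thread scanning the table name with strstr(). Approximately (paraphrased from memory, not the verbatim 10.5 source at dict0mem.h:1864), the check looks like the sketch below, so it reads the whole name buffer and trips ASAN as soon as table->name points into freed memory:
       
      #include <cstring>
       
      // Rough approximation of dict_table_t::is_temporary_name()
      // (frame #2 above); the exact 10.5 source may differ.
      static bool is_temporary_name(const char *name)
      {
          return std::strncmp(name, "#sql", 4) == 0
              || std::strstr(name, "/#sql") != nullptr;
      }
       
      int main()
      {
          return is_temporary_name("test/#sql-ib123") ? 0 : 1;
      }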
       
      RQG
      ====
      git clone https://github.com/mleich1/rqg --branch experimental RQG
       
      perl rqg.pl \
      --grammar=conf/mariadb/table_stress_innodb.yy \
      --gendata=conf/mariadb/table_stress.zz \
      --gendata_sql=conf/mariadb/table_stress.sql \
      --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
      --mysqld=--loose-idle_write_transaction_timeout=0 \
      --mysqld=--loose-idle_transaction_timeout=0 \
      --mysqld=--loose-idle_readonly_transaction_timeout=0 \
      --mysqld=--connect_timeout=60 \
      --mysqld=--interactive_timeout=28800 \
      --mysqld=--slave_net_timeout=60 \
      --mysqld=--net_read_timeout=30 \
      --mysqld=--net_write_timeout=60 \
      --mysqld=--loose-table_lock_wait_timeout=50 \
      --mysqld=--wait_timeout=28800 \
      --mysqld=--lock-wait-timeout=86400 \
      --mysqld=--innodb-lock-wait-timeout=50 \
      --no-mask \
      --queries=10000000 \
      --seed=random \
      --reporters=Backtrace \
      --reporters=ErrorLog \
      --reporters=Deadlock1 \
      --validators=None \
      --mysqld=--log_output=none \
      --mysqld=--log_bin_trust_function_creators=1 \
      --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
      --engine=InnoDB \
      --restart_timeout=240 \
      --mysqld=--plugin-load-add=file_key_management.so \
      --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
      --duration=300 \
      --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
      --mysqld=--loose-innodb-sync-debug \
      --mysqld=--innodb_stats_persistent=on \
      --mysqld=--innodb_adaptive_hash_index=on \
      --mysqld=--log-bin \
      --mysqld=--sync-binlog=1 \
      --mysqld=--loose-innodb_evict_tables_on_commit_debug=off \
      --mysqld=--loose-max-statement-time=30 \
      --threads=33 \
      --mysqld=--innodb-use-native-aio=0 \
      --rr=Extended \
      --rr_options=--chaos --wait \
      --mysqld=--innodb_page_size=16K \
      --mysqld=--innodb-buffer-pool-size=256M \
      --no_mask \
      --workdir=<local settings> \
      --vardir=<local settings> \
      --mtr-build-thread=<local settings> \
      --basedir1=<local settings> \
      --script_debug=_nix_
      

          Activity

            mleich Matthias Leich added a comment:
             
            sdp:/data/Results/1631107041/002810.tar.xz
             
            Pattern for RQG
            [ 'TBR-1193-MDEV-26578', 'ERROR: AddressSanitizer: heap-use-after-free on address .+#2 .{1,30} in dict_table_t::is_temporary_name.+#3 .{1,30} in table_name_t::is_temporary.+#4 .{1,30} in row_purge_parse_undo_rec.+#5 .{1,30} in row_purge.+#6 .{1,30} in row_purge_step.+#7 .{1,30} in que_thr_step.+#8 .{1,30} in que_run_threads_low.+#9 .{1,30} in que_run_threads.+RESULT: The RQG run ended with status STATUS_SERVER_CRASHED' ],
            


            marko Marko Mäkelä added a comment:
             
            For the record, here is the stack trace of freeing the memory. After the SIGABRT, I simply set a watchpoint on the AddressSanitizer shadow byte, did reverse-continue (and continue) to break here:

            Thread 19 hit Hardware watchpoint 3: -location *(char*)0x0c088006c6e2
             
            Old value = 0 '\000'
            New value = -3 '\375'
            __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:274
            274	in ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
            (rr) bt
            #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:274
            #1  0x00000000675fa9d9 in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
            #2  0x00000000675f6deb in ?? () from /usr/lib/x86_64-linux-gnu/libasan.so.5
            #3  0x00000000676dc0b2 in realloc () from /usr/lib/x86_64-linux-gnu/libasan.so.5
            #4  0x00005598328402d5 in dict_table_rename_in_cache (table=0x618000141908, new_name=0x783a75b0f020 "cool_down/FTS_", '0' <repeats 13 times>, "1b2_CONFIG", rename_also_foreigns=true, replace_new_file=false)
                at /data/Server/bb-10.5-MDEV-25776/storage/innobase/dict/dict0dict.cc:1623
            #5  0x00005598325522e1 in row_rename_table_for_mysql (old_name=0x783a75b0f2c0 "test/FTS_", '0' <repeats 13 times>, "1b2_CONFIG", new_name=0x783a75b0f020 "cool_down/FTS_", '0' <repeats 13 times>, "1b2_CONFIG", 
                trx=0x3b2b7cb82608, commit=false, use_fk=false) at /data/Server/bb-10.5-MDEV-25776/storage/innobase/row/row0mysql.cc:4515
            #6  0x000055983293c5a0 in fts_rename_one_aux_table (new_name=0x783a75b10fe0 "cool_down/t3", fts_table_old_name=0x783a75b0f2c0 "test/FTS_", '0' <repeats 13 times>, "1b2_CONFIG", trx=0x3b2b7cb82608)
                at /data/Server/bb-10.5-MDEV-25776/storage/innobase/fts/fts0fts.cc:1486
            …
            #17 0x000055983114e234 in mysql_parse (thd=0x62b0000ee218, rawbuf=0x62b0000f5238 "RENAME TABLE t3 TO cool_down.t3  /* E_R Thread1 QNO 186 CON_ID 16 */", length=68, parser_state=0x783a75b13cb0, 
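             
            For reference, the watched address follows from the default AddressSanitizer shadow mapping on x86_64 Linux, shadow = (addr >> 3) + 0x7fff8000, and the new shadow value 0xfd ('\375') is ASAN's marker for freed heap memory. A minimal sketch (my own illustration, not server code) reproduces the watchpoint location from the poisoned address in the report:
             
            #include <cinttypes>
            #include <cstdint>
            #include <cstdio>
             
            // Default ASAN shadow mapping on x86_64 Linux: one shadow byte
            // describes 8 bytes of application memory.
            static uintptr_t asan_shadow(uintptr_t addr)
            {
                return (addr >> 3) + 0x7fff8000;
            }
             
            int main()
            {
                // The freed address from the ASAN report above.
                const uintptr_t freed = 0x6040003a3710;
                // Prints shadow = 0xc088006c6e2, the byte watched above.
                std::printf("shadow = 0x%" PRIxPTR "\n", asan_shadow(freed));
                return 0;
            }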
            

            Because the internal FTS table name is being made longer (renamed from test.% to cool_down.%), realloc() will move the memory, and the pointer to the old name becomes invalid.
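             
            A minimal, self-contained sketch of this failure mode (my own illustration, not the InnoDB code): one piece of code keeps a raw pointer into a heap buffer while another grows that buffer with realloc(), which is allowed to move it:
             
            #include <cstdio>
            #include <cstdlib>
            #include <cstring>
             
            int main()
            {
                // Analogous to the dictionary cache holding the table name.
                char *name = static_cast<char*>(std::malloc(16));
                std::strcpy(name, "test/t3");
             
                // Analogous to purge caching the table name pointer.
                const char *stale = name;
             
                // The rename to a longer name grows the buffer; realloc()
                // may move it, as it did in dict_table_rename_in_cache().
                name = static_cast<char*>(std::realloc(name, 64));
                std::strcpy(name, "cool_down/t3");
             
                // If the block moved, this read through the old pointer is
                // the heap-use-after-free that ASAN reported in
                // dict_table_t::is_temporary_name().
                std::puts(std::strstr(stale, "/#sql") ? "temp" : "not temp");
             
                std::free(name);
                return 0;
            }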

            I suspect that this bug could affect 10.6 as well. Nothing will acquire MDL_EXCLUSIVE on the internal table names on RENAME. In 10.6 with MDEV-25506 part 3, the purge_sys.stop_FTS() logic should only cover DROP-like operations, not RENAME.


            marko Marko Mäkelä added a comment:
             
            10.6 and later releases are not affected by this, because the third and last part of MDEV-25506 removed the problematic call and introduced dict_table_t::mdl_name so that purge is able to acquire MDL on the original name of the table. Maybe that part should simply be backported to 10.5?
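             
            A rough sketch of that idea (hand-written here for illustration; the actual dict_table_t and MDL code from MDEV-25506 differ): keep a stable copy of the name alongside the rename-able cache name, and let purge acquire MDL only through the stable copy:
             
            #include <cassert>
            #include <string>
             
            // Illustrative shape only, not the actual 10.6 definition.
            struct table_sketch
            {
                std::string name;      // current cache name; rewritten by RENAME TABLE
                std::string mdl_name;  // stable name purge uses to acquire MDL
            };
             
            int main()
            {
                table_sketch t{"test/t3", "test/t3"};
                // RENAME TABLE rewrites (and may reallocate) the cache name;
                // purge never dereferences it, so the rename cannot invalidate
                // the name used for MDL acquisition.
                t.name = "cool_down/t3";
                assert(t.mdl_name == "test/t3");
                return 0;
            }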


            thiru Thirunarayanan Balathandayuthapani added a comment:
             
            Patch is in bb-10.5-MDEV-26578.


            marko Marko Mäkelä added a comment:
             
            I think that the proposed fix needs some stress testing. It may or may not work.


            mleich Matthias Leich added a comment:
             
            The tree
            origin/bb-10.5-MDEV-26578 3f9dadbb1c613418790c449098dbcaa04e568dbd 2022-04-08T12:48:27+05:30
            behaved well in RQG testing. The bad effects observed here also occur in the main tree.


            People

              Assignee: Thirunarayanan Balathandayuthapani (thiru)
              Reporter: Matthias Leich (mleich)

