Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-31886

Server fails to shut down storage engine plugins when thread_stack is set to the smallest negative value

Details

    • Bug
    • Status: Confirmed (View Workflow)
    • Minor
    • Resolution: Unresolved
    • 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL), 10.11, 11.0(EOL), 11.1(EOL), 11.2(EOL)
    • 10.5, 10.6, 10.11
    • None

    Description

      # mysqld options required for replay: --thread-stack=-5000
      

      The crash is happening only on 10.4 build.

      Leads to

      10.4.31 161ce045a71e306768d4609bdc35788fa5ea2a71 (Debug)

      mariadbd: /test/10.4_dbg/storage/innobase/sync/sync0debug.cc:1506: CreateTracker::~CreateTracker(): Assertion `m_files.empty()' failed.
      

      10.4.31 161ce045a71e306768d4609bdc35788fa5ea2a71 (Debug)

      Core was generated by `/test/MD090823-mariadb-10.4.31-linux-x86_64-dbg/bin/mariadbd --no-defaults --co'.
      Program terminated with signal SIGABRT, Aborted.
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      [Current thread is 1 (Thread 0x152141918900 (LWP 4028037))]
      (gdb) bt
      #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #1  0x0000152141af1859 in __GI_abort () at abort.c:79
      #2  0x0000152141af1729 in __assert_fail_base (fmt=0x152141c87588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x559a854316d5 "m_files.empty()", file=0x559a854078f0 "/test/10.4_dbg/storage/innobase/sync/sync0debug.cc", line=1506, function=<optimized out>) at assert.c:92
      #3  0x0000152141b02fd6 in __GI___assert_fail (assertion=assertion@entry=0x559a854316d5 "m_files.empty()", file=file@entry=0x559a854078f0 "/test/10.4_dbg/storage/innobase/sync/sync0debug.cc", line=line@entry=1506, function=function@entry=0x559a85408a58 "CreateTracker::~CreateTracker()") at assert.c:101
      #4  0x0000559a84da9613 in CreateTracker::~CreateTracker (this=0x559a862caac0 <create_tracker>, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_tree.h:1043
      #5  0x0000152141b158a7 in __run_exit_handlers (status=status@entry=1, listp=0x152141cbb718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
      #6  0x0000152141b15a60 in __GI_exit (status=status@entry=1) at exit.c:139
      #7  0x0000559a844ef6ae in start_signal_handler () at /test/10.4_dbg/sql/mysqld.cc:3176
      #8  mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/mysqld.cc:5802
      #9  0x0000559a844e3b06 in main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/main.cc:25
      

      Attachments

        Issue Links

          Activity

            Please provide a valid rr replay trace for this. I suspect that there is a memory overrun somewhere (likely outside InnoDB), causing srv_sys_space.m_space_id=0 to be overwritten with ~0ULL. The fil_system.sys_space and fil_system.temp_space in the core dump that you provided look reasonable to me.

            marko Marko Mäkelä added a comment - Please provide a valid rr replay trace for this. I suspect that there is a memory overrun somewhere (likely outside InnoDB), causing srv_sys_space.m_space_id=0 to be overwritten with ~0ULL. The fil_system.sys_space and fil_system.temp_space in the core dump that you provided look reasonable to me.

            The CreateTracker was removed in MariaDB Server 10.5.5 by MDEV-22930.

            marko Marko Mäkelä added a comment - The CreateTracker was removed in MariaDB Server 10.5.5 by MDEV-22930 .

            Thank you, I finally got a working trace of this.

            The assertion failure basically complains that create_tracker.register_latch() has been invoked without corresponding create_tracker.deregister_latch(). In the trace that I am looking at, there are 315 elements in create_tracker.m_files. The last one was registered here:

            #4  CreateTracker::register_latch (line=103, filename=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", ptr=0x557170da83f0 <btr_defragment_mutex+16>, this=0x557170da5ac0 <create_tracker>)
                at /test/10.4_dbg/storage/innobase/sync/sync0debug.cc:1530
            #5  sync_file_created_register (ptr=ptr@entry=0x557170da83f0 <btr_defragment_mutex+16>, filename=filename@entry=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", line=line@entry=103)
                at /test/10.4_dbg/storage/innobase/sync/sync0debug.cc:1635
            #6  0x000055716f6cb710 in GenericPolicy<TTASEventMutex<GenericPolicy> >::init (line=103, filename=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", id=LATCH_ID_BTR_DEFRAGMENT_MUTEX, 
                this=0x557170da83f0 <btr_defragment_mutex+16>) at /test/10.4_dbg/storage/innobase/include/sync0types.h:428
            #7  PolicyMutex<TTASEventMutex<GenericPolicy> >::init (line=103, filename=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", id=LATCH_ID_BTR_DEFRAGMENT_MUTEX, 
                this=0x557170da83e0 <btr_defragment_mutex>) at /test/10.4_dbg/storage/innobase/include/ib0mutex.h:662
            #8  mutex_init<PolicyMutex<TTASEventMutex<GenericPolicy> > > (mutex=mutex@entry=0x557170da83e0 <btr_defragment_mutex>, id=id@entry=LATCH_ID_BTR_DEFRAGMENT_MUTEX, 
                file_name=file_name@entry=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", line=line@entry=103) at /test/10.4_dbg/storage/innobase/include/ut0mutex.h:176
            #9  0x000055716f94a4d9 in btr_defragment_init () at /test/10.4_dbg/storage/innobase/btr/btr0defragment.cc:103
            #10 0x000055716f86fd3e in srv_start (create_new_db=<optimized out>) at /test/10.4_dbg/storage/innobase/srv/srv0start.cc:2423
            #11 0x000055716f64f552 in innodb_init (p=<optimized out>) at /test/10.4_dbg/storage/innobase/handler/ha_innodb.cc:4365
            #12 0x000055716f32527c in ha_initialize_handlerton (plugin=0x55717203e250) at /test/10.4_dbg/sql/handler.cc:581
            #13 0x000055716f0b8355 in plugin_do_initialize (plugin=plugin@entry=0x55717203e250, state=@0x7ffc3106f23c: 4) at /test/10.4_dbg/sql/sql_plugin.cc:1429
            #14 0x000055716f0bdcbe in plugin_initialize (tmp_root=tmp_root@entry=0x7ffc3106f6e0, plugin=0x55717203e250, argc=argc@entry=0x55717052d368 <remaining_argc>, argv=argv@entry=0x557172000660, 
                options_only=options_only@entry=false) at /test/10.4_dbg/sql/sql_plugin.cc:1482
            #15 0x000055716f0bf13b in plugin_init (argc=argc@entry=0x55717052d368 <remaining_argc>, argv=<optimized out>, flags=0) at /test/10.4_dbg/sql/sql_plugin.cc:1730
            #16 0x000055716efc41db in init_server_components () at /test/10.4_dbg/sql/mysqld.cc:5233
            #17 0x000055716efc9acc in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/mysqld.cc:5772
            #18 0x000055716efbeb06 in main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/main.cc:25
            

            The reason for this check failing is that an orderly shutdown of all plugins is being skipped here:

            #0  __run_exit_handlers (status=status@entry=1, listp=0x49dd4a22a718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:40
            #1  0x000049dd4a084a60 in __GI_exit (status=status@entry=1) at exit.c:139
            #2  0x000055716efca6ae in start_signal_handler () at /test/10.4_dbg/sql/mysqld.cc:3176
            #3  mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/mysqld.cc:5802
            #4  0x000055716efbeb06 in main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/main.cc:25
            

            In MariaDB Server 10.5 or later, there may not be any assertion failure, but I think it is still wrong to call exit() without making an effort to shut down the storage engine plugins. I have seen unireg_abort() being invoked in similar cases. It would seem to do the correct thing. ramesh, can you test this patch?

            diff --git a/sql/mysqld.cc b/sql/mysqld.cc
            index c110baa9634..34397b068c4 100644
            --- a/sql/mysqld.cc
            +++ b/sql/mysqld.cc
            @@ -3173,7 +3173,7 @@ static void start_signal_handler(void)
               {
                 sql_print_error("Can't create interrupt-thread (error %d, errno: %d)",
             		    error,errno);
            -    exit(1);
            +    unireg_abort(1);
               }
               mysql_cond_wait(&COND_start_thread, &LOCK_start_thread);
               mysql_mutex_unlock(&LOCK_start_thread);
            

            Possibly there are other such calls that need to be replaced.

            marko Marko Mäkelä added a comment - Thank you, I finally got a working trace of this. The assertion failure basically complains that create_tracker.register_latch() has been invoked without corresponding create_tracker.deregister_latch() . In the trace that I am looking at, there are 315 elements in create_tracker.m_files . The last one was registered here: #4 CreateTracker::register_latch (line=103, filename=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", ptr=0x557170da83f0 <btr_defragment_mutex+16>, this=0x557170da5ac0 <create_tracker>) at /test/10.4_dbg/storage/innobase/sync/sync0debug.cc:1530 #5 sync_file_created_register (ptr=ptr@entry=0x557170da83f0 <btr_defragment_mutex+16>, filename=filename@entry=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", line=line@entry=103) at /test/10.4_dbg/storage/innobase/sync/sync0debug.cc:1635 #6 0x000055716f6cb710 in GenericPolicy<TTASEventMutex<GenericPolicy> >::init (line=103, filename=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", id=LATCH_ID_BTR_DEFRAGMENT_MUTEX, this=0x557170da83f0 <btr_defragment_mutex+16>) at /test/10.4_dbg/storage/innobase/include/sync0types.h:428 #7 PolicyMutex<TTASEventMutex<GenericPolicy> >::init (line=103, filename=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", id=LATCH_ID_BTR_DEFRAGMENT_MUTEX, this=0x557170da83e0 <btr_defragment_mutex>) at /test/10.4_dbg/storage/innobase/include/ib0mutex.h:662 #8 mutex_init<PolicyMutex<TTASEventMutex<GenericPolicy> > > (mutex=mutex@entry=0x557170da83e0 <btr_defragment_mutex>, id=id@entry=LATCH_ID_BTR_DEFRAGMENT_MUTEX, file_name=file_name@entry=0x55716fef5888 "/test/10.4_dbg/storage/innobase/btr/btr0defragment.cc", line=line@entry=103) at /test/10.4_dbg/storage/innobase/include/ut0mutex.h:176 #9 0x000055716f94a4d9 in btr_defragment_init () at /test/10.4_dbg/storage/innobase/btr/btr0defragment.cc:103 #10 0x000055716f86fd3e in srv_start (create_new_db=<optimized out>) at /test/10.4_dbg/storage/innobase/srv/srv0start.cc:2423 #11 0x000055716f64f552 in innodb_init (p=<optimized out>) at /test/10.4_dbg/storage/innobase/handler/ha_innodb.cc:4365 #12 0x000055716f32527c in ha_initialize_handlerton (plugin=0x55717203e250) at /test/10.4_dbg/sql/handler.cc:581 #13 0x000055716f0b8355 in plugin_do_initialize (plugin=plugin@entry=0x55717203e250, state=@0x7ffc3106f23c: 4) at /test/10.4_dbg/sql/sql_plugin.cc:1429 #14 0x000055716f0bdcbe in plugin_initialize (tmp_root=tmp_root@entry=0x7ffc3106f6e0, plugin=0x55717203e250, argc=argc@entry=0x55717052d368 <remaining_argc>, argv=argv@entry=0x557172000660, options_only=options_only@entry=false) at /test/10.4_dbg/sql/sql_plugin.cc:1482 #15 0x000055716f0bf13b in plugin_init (argc=argc@entry=0x55717052d368 <remaining_argc>, argv=<optimized out>, flags=0) at /test/10.4_dbg/sql/sql_plugin.cc:1730 #16 0x000055716efc41db in init_server_components () at /test/10.4_dbg/sql/mysqld.cc:5233 #17 0x000055716efc9acc in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/mysqld.cc:5772 #18 0x000055716efbeb06 in main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/main.cc:25 The reason for this check failing is that an orderly shutdown of all plugins is being skipped here: #0 __run_exit_handlers (status=status@entry=1, listp=0x49dd4a22a718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:40 #1 0x000049dd4a084a60 in __GI_exit (status=status@entry=1) at exit.c:139 #2 0x000055716efca6ae in start_signal_handler () at /test/10.4_dbg/sql/mysqld.cc:3176 #3 mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/mysqld.cc:5802 #4 0x000055716efbeb06 in main (argc=<optimized out>, argv=<optimized out>) at /test/10.4_dbg/sql/main.cc:25 In MariaDB Server 10.5 or later, there may not be any assertion failure, but I think it is still wrong to call exit() without making an effort to shut down the storage engine plugins. I have seen unireg_abort() being invoked in similar cases. It would seem to do the correct thing. ramesh , can you test this patch? diff --git a/sql/mysqld.cc b/sql/mysqld.cc index c110baa9634..34397b068c4 100644 --- a/sql/mysqld.cc +++ b/sql/mysqld.cc @@ -3173,7 +3173,7 @@ static void start_signal_handler(void) { sql_print_error("Can't create interrupt-thread (error %d, errno: %d)", error,errno); - exit(1); + unireg_abort(1); } mysql_cond_wait(&COND_start_thread, &LOCK_start_thread); mysql_mutex_unlock(&LOCK_start_thread); Possibly there are other such calls that need to be replaced.

            After applying the patch the server shows {{safe_mutex: Trying to destroy a mutex LOCK_start_thread }} message.

            2023-08-10 15:42:24 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded
            2023-08-10 15:42:24 0 [Warning] Can't create handle_manager thread (errno: 22 "Invalid argument")
            2023-08-10 15:42:24 0 [ERROR] Can't open and lock privilege tables: Thread stack overrun:  3776 bytes used of a -5120 byte stack, and 81920 bytes needed.  Use 'mysqld --thread_stack=#' to specify a bigger stack
            2023-08-10 15:42:24 0 [Note] Server socket created on IP: '::'.
            2023-08-10 15:42:24 0 [ERROR] Can't create interrupt-thread (error 22, errno: 2)
            2023-08-10 15:42:24 0 [ERROR] Aborting
            safe_mutex: Trying to destroy a mutex LOCK_start_thread that was locked at /test/10.4_dbg/sql/mysqld.cc, line 3169 at /test/10.4_dbg/sql/mysqld.cc, line 2107
            

            ramesh Ramesh Sivaraman added a comment - After applying the patch the server shows {{safe_mutex: Trying to destroy a mutex LOCK_start_thread }} message. 2023-08-10 15:42:24 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded 2023-08-10 15:42:24 0 [Warning] Can't create handle_manager thread (errno: 22 "Invalid argument") 2023-08-10 15:42:24 0 [ERROR] Can't open and lock privilege tables: Thread stack overrun: 3776 bytes used of a -5120 byte stack, and 81920 bytes needed. Use 'mysqld --thread_stack=#' to specify a bigger stack 2023-08-10 15:42:24 0 [Note] Server socket created on IP: '::'. 2023-08-10 15:42:24 0 [ERROR] Can't create interrupt-thread (error 22, errno: 2) 2023-08-10 15:42:24 0 [ERROR] Aborting safe_mutex: Trying to destroy a mutex LOCK_start_thread that was locked at /test/10.4_dbg/sql/mysqld.cc, line 3169 at /test/10.4_dbg/sql/mysqld.cc, line 2107

            marko As discussed, after applying following patch it shows memory issue

            diff --git a/sql/mysqld.cc b/sql/mysqld.cc
            index c110baa9634..f646752e539 100644
            --- a/sql/mysqld.cc
            +++ b/sql/mysqld.cc
            @@ -3173,7 +3173,8 @@ static void start_signal_handler(void)
               {
                 sql_print_error("Can't create interrupt-thread (error %d, errno: %d)",
             		    error,errno);
            -    exit(1);
            +    mysql_mutex_unlock(&LOCK_start_thread);
            +    unireg_abort(1);
               }
               mysql_cond_wait(&COND_start_thread, &LOCK_start_thread);
               mysql_mutex_unlock(&LOCK_start_thread);
            

            2023-08-10 16:05:44 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded
            2023-08-10 16:05:44 0 [Warning] Can't create handle_manager thread (errno: 22 "Invalid argument")
            2023-08-10 16:05:44 0 [ERROR] Can't open and lock privilege tables: Thread stack overrun:  3776 bytes used of a -5120 byte stack, and 81920 bytes needed.  Use 'mysqld --thread_stack=#' to specify a bigger stack
            2023-08-10 16:05:44 0 [Note] Server socket created on IP: '::'.
            2023-08-10 16:05:44 0 [ERROR] Can't create interrupt-thread (error 22, errno: 0)
            2023-08-10 16:05:44 0 [ERROR] Aborting
            Warning: Memory not freed: 4328
            

            ramesh Ramesh Sivaraman added a comment - marko As discussed, after applying following patch it shows memory issue diff --git a/sql/mysqld.cc b/sql/mysqld.cc index c110baa9634..f646752e539 100644 --- a/sql/mysqld.cc +++ b/sql/mysqld.cc @@ -3173,7 +3173,8 @@ static void start_signal_handler(void) { sql_print_error("Can't create interrupt-thread (error %d, errno: %d)", error,errno); - exit(1); + mysql_mutex_unlock(&LOCK_start_thread); + unireg_abort(1); } mysql_cond_wait(&COND_start_thread, &LOCK_start_thread); mysql_mutex_unlock(&LOCK_start_thread); 2023-08-10 16:05:44 0 [ERROR] Could not open mysql.plugin table. Some plugins may be not loaded 2023-08-10 16:05:44 0 [Warning] Can't create handle_manager thread (errno: 22 "Invalid argument") 2023-08-10 16:05:44 0 [ERROR] Can't open and lock privilege tables: Thread stack overrun: 3776 bytes used of a -5120 byte stack, and 81920 bytes needed. Use 'mysqld --thread_stack=#' to specify a bigger stack 2023-08-10 16:05:44 0 [Note] Server socket created on IP: '::'. 2023-08-10 16:05:44 0 [ERROR] Can't create interrupt-thread (error 22, errno: 0) 2023-08-10 16:05:44 0 [ERROR] Aborting Warning: Memory not freed: 4328

            People

              sanja Oleksandr Byelkin
              ramesh Ramesh Sivaraman
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.