  1. MariaDB Server
  2. MDEV-24612

InnoDB hangs if its initialization is aborted before encryption threads are started

Details

    Description

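      If InnoDB cannot be initialized for some reason, innodb_init() ends up in one of two shutdown paths: srv_shutdown_threads() (reached via srv_start() and srv_init_abort_low()), which sets srv_shutdown_state = SRV_SHUTDOWN_EXIT_THREADS, or innodb_preshutdown(), which sets srv_shutdown_state = SRV_SHUTDOWN_INITIATED. Both paths call fil_crypt_set_thread_cnt(). The call tree for 10.5 is the following (callees first, callers nested below):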

      ▾ fil_crypt_threads_init                                                        
        ▾ fil_crypt_set_thread_cnt                                                    
          ▾ srv_shutdown_threads                                                      
            ▾ srv_init_abort_low                                                      
              ▾ srv_start                                                             
               ▸ innodb_init                                                          
          ▾ innodb_preshutdown                                                        
            • innodb_init     
      

      If the encryption threads have not been initialized yet, fil_crypt_set_thread_cnt() invokes fil_crypt_threads_init(), which in turn invokes fil_crypt_set_thread_cnt(srv_n_fil_crypt_threads) again to start srv_n_fil_crypt_threads threads. However, encryption threads terminate if srv_shutdown_state != SRV_SHUTDOWN_NONE (see fil_crypt_thread()), so fil_crypt_set_thread_cnt() waits forever for the threads to start:

      #0  0x00007ffff6e13065 in futex_abstimed_wait_cancelable (
          private=<optimized out>, abstime=0x7fffffff7f40, expected=0, 
          futex_word=0x5555586e6f80)
          at ../sysdeps/unix/sysv/linux/futex-internal.h:205
      #1  __pthread_cond_wait_common (abstime=0x7fffffff7f40, mutex=0x5555586e6f30, 
          cond=0x5555586e6f58) at pthread_cond_wait.c:539
      #2  __pthread_cond_timedwait (cond=0x5555586e6f58, mutex=0x5555586e6f30, 
          abstime=0x7fffffff7f40) at pthread_cond_wait.c:667
      #3  0x000055555673326a in os_event::timed_wait (this=0x5555586e6f18, 
          abstime=0x7fffffff7f40)
          at ./storage/innobase/os/os0event.cc:275
      #4  0x000055555673352a in os_event::wait_time_low (this=0x5555586e6f18, 
          time_in_usec=100000, reset_sig_count=9)
          at ./storage/innobase/os/os0event.cc:385
      #5  0x000055555673371e in os_event_wait_time_low (event=0x5555586e6f18, 
          time_in_usec=100000, reset_sig_count=0)
          at ./storage/innobase/os/os0event.cc:485
      #6  0x00005555569d91de in fil_crypt_set_thread_cnt (new_cnt=4)
          at ./storage/innobase/fil/fil0crypt.cc:2242
      #7  0x00005555569d9604 in fil_crypt_threads_init ()
          at ./storage/innobase/fil/fil0crypt.cc:2362
      #8  0x00005555569d9003 in fil_crypt_set_thread_cnt (new_cnt=0)
          at ./storage/innobase/fil/fil0crypt.cc:2219
      #9  0x0000555556857009 in srv_shutdown_threads ()
          at ./storage/innobase/srv/srv0start.cc:839
      #10 0x000055555685724d in srv_init_abort_low (create_new_db=false, 
          file=0x555556ff09b0 "./storage/innobase/srv/srv0start.cc", line=1495, err=DB_CORRUPTION)
          at ./storage/innobase/srv/srv0start.cc:887
      #11 0x00005555568590cf in srv_start (create_new_db=false)
          at ./storage/innobase/srv/srv0start.cc:1495
      #12 0x000055555661f491 in innodb_init (p=0x555558542918)
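
The wait in frame #6 can be modelled outside the server. The following is only a sketch of the mechanism described above, not the InnoDB code: crypt_model, worker(), start_and_wait() and the max_polls bound are hypothetical names, and it assumes that a worker registers itself in a "started" counter that the caller polls with a timed wait. The real loop has no bound, which is why it never returns.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Simplified stand-in for srv_shutdown_state (hypothetical, not the real enum).
enum shutdown_state_t { SHUTDOWN_NONE, SHUTDOWN_INITIATED, SHUTDOWN_EXIT_THREADS };

struct crypt_model {
    std::atomic<shutdown_state_t> shutdown_state{SHUTDOWN_NONE};
    std::mutex mtx;
    std::condition_variable started_cv;
    unsigned started = 0;  // workers that registered themselves as running

    // Models fil_crypt_thread(): once shutdown has begun, the worker
    // returns without ever registering itself as started.
    void worker() {
        if (shutdown_state.load() != SHUTDOWN_NONE) return;
        std::lock_guard<std::mutex> guard(mtx);
        ++started;
        started_cv.notify_all();
    }

    // Models the wait in fil_crypt_set_thread_cnt(): spawn new_cnt workers,
    // then poll with a timed wait until all of them have started.
    // max_polls exists only so that this demo terminates.
    bool start_and_wait(unsigned new_cnt, unsigned max_polls) {
        std::vector<std::thread> threads;
        for (unsigned i = 0; i < new_cnt; ++i)
            threads.emplace_back(&crypt_model::worker, this);

        bool all_started = false;
        {
            std::unique_lock<std::mutex> lock(mtx);
            for (unsigned i = 0; i < max_polls && !all_started; ++i) {
                all_started = started_cv.wait_for(
                    lock, std::chrono::milliseconds(10),
                    [&] { return started == new_cnt; });
            }
        }
        for (auto& t : threads) t.join();
        return all_started;
    }
};
```

With shutdown_state left at SHUTDOWN_NONE, start_and_wait(4, 100) returns true as soon as the four workers have registered. With shutdown_state set to SHUTDOWN_EXIT_THREADS beforehand, every worker returns early, the counter stays at 0, and the call gives up only because of the artificial max_polls limit.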
      

      How to repeat:
      Make srv_start() fail so that it calls srv_init_abort_low(). In the trace above the failure was DB_CORRUPTION.

      How to fix:
      Do not initialize encryption threads if shutdown is already in progress:

      --- a/storage/innobase/fil/fil0crypt.cc
      +++ b/storage/innobase/fil/fil0crypt.cc
      @@ -2216,6 +2216,8 @@ fil_crypt_set_thread_cnt(
              const uint      new_cnt)
       {
              if (!fil_crypt_threads_inited) {
      +               if (srv_shutdown_state != SRV_SHUTDOWN_NONE)
      +                       return;
                      fil_crypt_threads_init();
              }
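
The effect of the guard can be checked in isolation. This is a minimal sketch under the same simplifying assumptions, not the patched InnoDB function: crypt_guard_model and its members are hypothetical stand-ins for fil_crypt_threads_inited, srv_shutdown_state and the thread count, and threads_init() only records that it was called.

```cpp
#include <cassert>

// Simplified stand-in for srv_shutdown_state (hypothetical, not the real enum).
enum model_shutdown_state { MODEL_SHUTDOWN_NONE, MODEL_SHUTDOWN_INITIATED,
                            MODEL_SHUTDOWN_EXIT_THREADS };

struct crypt_guard_model {
    model_shutdown_state shutdown_state = MODEL_SHUTDOWN_NONE;
    bool threads_inited = false;  // stand-in for fil_crypt_threads_inited
    unsigned init_calls = 0;      // how often threads_init() ran
    unsigned thread_cnt = 0;      // last requested thread count

    // Stand-in for fil_crypt_threads_init(); the real function starts threads.
    void threads_init() {
        ++init_calls;
        threads_inited = true;
    }

    // Mirrors the shape of the patched fil_crypt_set_thread_cnt():
    // skip lazy initialization entirely once shutdown has begun.
    void set_thread_cnt(unsigned new_cnt) {
        if (!threads_inited) {
            if (shutdown_state != MODEL_SHUTDOWN_NONE)
                return;
            threads_init();
        }
        thread_cnt = new_cnt;
    }
};
```

In this model, calling set_thread_cnt(0) after shutdown_state has been set to MODEL_SHUTDOWN_EXIT_THREADS returns without touching threads_init(), which corresponds to the shutdown path no longer trying to start threads that would exit at once. A call during normal operation still initializes lazily, as before.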
      
      

            People

              vlad.lesin Vladislav Lesin