MariaDB Server / MDEV-31049

fil_delete_tablespace() returns wrong file handle if tablespace was closed by parallel thread

Details

    Description

      trx_t::commit(std::vector<pfs_os_file_t> &deleted) invokes fil_delete_tablespace(), which in turn calls fil_space_free_low(), for the tablespace of each dropped table in mod_tables:

      void trx_t::commit(std::vector<pfs_os_file_t> &deleted)                                                                           
      {                                                                               
      ...                                                                             
          for (const auto &p : mod_tables)                                            
          {                                                                           
            if (p.second.is_dropped())                                                
            {                                                                         
            ...                                                                       
              if (const auto id= space ? space->id : 0)                               
              {                                                                       
                pfs_os_file_t d= fil_delete_tablespace(id);                           
                if (d != OS_FILE_CLOSED)                                              
                  deleted.emplace_back(d);                                            
              }                                                                       
            }                                                                         
          }                                                                           
      ...                                                                             
      }
      

      and collects the returned file handles in the "deleted" vector. ha_innobase::delete_table() then closes the file handles in "deleted":

      int ha_innobase::delete_table(const char *name)                                                                        
      {                                                                               
      ...                                                                                                      
        std::vector<pfs_os_file_t> deleted;                                           
        trx->commit(deleted);                                                         
      ...                                                                             
        row_mysql_unlock_data_dictionary(trx);                                        
        for (pfs_os_file_t d : deleted)                                               
          os_file_close(d);                                                           
      ...                                                                             
      }
      

      Consider the fil_delete_tablespace() function, which returns the file handles that end up in the "deleted" vector:

      pfs_os_file_t fil_delete_tablespace(ulint id)                                   
      {                                                                               
        ut_ad(!is_system_tablespace(id));                                             
        pfs_os_file_t handle= OS_FILE_CLOSED;                                         
        if (fil_space_t *space= fil_space_t::check_pending_operations(id))            
        {                                                                             
          /* Before deleting the file(s), persistently write a log record. */         
          mtr_t mtr;                                                                  
          mtr.start();                                                                
          mtr.log_file_op(FILE_DELETE, id, space->chain.start->name);                 
          handle= space->chain.start->handle;                                        
          mtr.commit_file(*space, nullptr);                                           
                                                                                      
          fil_space_free_low(space);                                                  
        }                                                                             
                                                                                      
        ibuf_delete_for_discarded_space(id);                                          
        return handle;                                                                
      }
      

      fil_system_t::detach() is invoked from mtr_t::commit_file(). However, while fil_delete_tablespace() is executing, the page cleaner thread can close the tablespace file via buf_do_LRU_batch() between the lines "handle= space->chain.start->handle;" and "return handle;". It does so with the following stack:

      #1  0x000055fd4a0caf59 in os_file_close_func (file=15) at ./storage/innobase/os/os0file.cc:1452
      #2  0x000055fd4a319d0d in fil_node_t::close (this=0x5c8e34110e60) at ./storage/innobase/fil/fil0fil.cc:453
      #3  0x000055fd4a318e20 in fil_space_t::try_to_close (print_info=false)          
          at ./storage/innobase/fil/fil0fil.cc:124                                    
      #4  0x000055fd4a319bf9 in fil_node_open_file (node=0x5c8e34037e30) at ./storage/innobase/fil/fil0fil.cc:422
      #5  0x000055fd4a31adb2 in fil_space_t::prepare_acquired (this=0x5c8e34037cf0)   
          at ./storage/innobase/fil/fil0fil.cc:656                                    
      #6  0x000055fd4a31e907 in fil_space_t::get (id=67) at ./storage/innobase/fil/fil0fil.cc:1482
      #7  0x000055fd4a2af3a4 in buf_flush_space (id=67) at ./storage/innobase/buf/buf0flu.cc:1186
      #8  0x000055fd4a2afb72 in buf_flush_LRU_list_batch (max=2000, evict=false, n=0x326e0a09cbf0)
          at ./storage/innobase/buf/buf0flu.cc:1293                                   
      #9  0x000055fd4a2b0002 in buf_do_LRU_batch (max=2000, evict=false, n=0x326e0a09cbf0)
          at ./storage/innobase/buf/buf0flu.cc:1362                                   
      #10 0x000055fd4a2b1597 in buf_flush_LRU (max_n=2000, evict=false) at ./storage/innobase/buf/buf0flu.cc:1708
      #11 0x000055fd4a2b441f in buf_flush_page_cleaner () at ./storage/innobase/buf/buf0flu.cc:2310
      

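      So "space->chain.start->handle" can be set to -1 by a parallel thread, while fil_delete_tablespace() still returns the old value saved in its local "handle" variable. A caller such as ha_innobase::delete_table() then passes that stale handle to os_file_close(), although the file has already been closed.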
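
      The interleaving can be replayed deterministically in a single thread. The following is a minimal sketch, not InnoDB code: Node, kClosed, try_to_close_model(), delete_tablespace_racy() and replay_race() are hypothetical stand-ins introduced only for illustration, and closing a file is modeled by incrementing a counter.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not InnoDB types.
constexpr int kClosed = -1;   // models OS_FILE_CLOSED

struct Node {
  int handle;        // models space->chain.start->handle
  int close_calls;   // how many times the descriptor was "closed"
};

// Models the page cleaner path: the file is closed and the node is
// marked as closed.
void try_to_close_model(Node &node) {
  if (node.handle != kClosed) {
    ++node.close_calls;
    node.handle = kClosed;
  }
}

// Models the racy fil_delete_tablespace(): the handle is copied first, and
// the other thread's close is replayed before the function returns.
int delete_tablespace_racy(Node &node) {
  int handle = node.handle;    // handle= space->chain.start->handle;
  try_to_close_model(node);    // the parallel thread closes the file here
  return handle;               // stale value: the file is already closed
}

// Returns how many times descriptor 15 is closed in this interleaving.
int replay_race() {
  Node node{15, 0};
  int returned = delete_tablespace_racy(node);
  assert(node.handle == kClosed);  // the node itself knows it is closed
  if (returned != kClosed)         // same check as in trx_t::commit()
    ++node.close_calls;            // models os_file_close(d) in the caller
  return node.close_calls;
}
```

      In this model replay_race() reports two closes of the same descriptor, which is the symptom described above.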

      fil_space_t::try_to_close() is executed under fil_system.mutex, and mtr_t::commit_file() acquires the same mutex for its fil_system_t::detach() call. fil_system_t::detach() returns the detached file handle if its argument detach_handle is true. The fix is to have mtr_t::commit_file() pass that detached file handle back to fil_delete_tablespace().
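
      The idea behind the fix can be sketched with a small model in which the handle is read and invalidated in one step under the mutex. This is an assumption-laden illustration rather than the actual patch: FileSystemModel, kFileClosed, delete_tablespace_fixed() and the two scenario functions are hypothetical names, and closing a file is again modeled by a counter.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical model of the proposed fix; names are illustrative only.
constexpr int kFileClosed = -1;   // models OS_FILE_CLOSED

struct FileSystemModel {
  std::mutex mutex;    // models fil_system.mutex
  int handle;          // models space->chain.start->handle
  int close_calls;     // how many times the descriptor was "closed"

  explicit FileSystemModel(int h) : handle(h), close_calls(0) {}

  // Models fil_space_t::try_to_close(): runs under the mutex.
  void try_to_close() {
    std::lock_guard<std::mutex> guard(mutex);
    if (handle != kFileClosed) {
      ++close_calls;
      handle = kFileClosed;
    }
  }

  // Models fil_system_t::detach() with detach_handle=true: hands out
  // whatever handle is still open, while holding the mutex.
  int detach() {
    std::lock_guard<std::mutex> guard(mutex);
    int detached = handle;
    handle = kFileClosed;
    return detached;
  }
};

// Models the fixed fil_delete_tablespace(): it returns the handle obtained
// from detach() instead of a copy taken earlier without the mutex.
int delete_tablespace_fixed(FileSystemModel &fs) { return fs.detach(); }

// Scenario 1: the parallel thread closes the file before the delete.
int closes_when_cleaner_wins() {
  FileSystemModel fs(15);
  fs.try_to_close();
  int d = delete_tablespace_fixed(fs);     // kFileClosed, not a stale 15
  if (d != kFileClosed) ++fs.close_calls;  // caller's os_file_close(d)
  return fs.close_calls;
}

// Scenario 2: the delete detaches the handle before the parallel thread runs.
int closes_when_delete_wins() {
  FileSystemModel fs(15);
  int d = delete_tablespace_fixed(fs);     // detaches 15
  fs.try_to_close();                       // nothing left to close
  if (d != kFileClosed) ++fs.close_calls;  // caller's os_file_close(d)
  return fs.close_calls;
}
```

      In both orderings the descriptor is closed exactly once in this model, because whichever side takes the mutex first is the only one that ever sees the open handle.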

            People

              vlad.lesin Vladislav Lesin