Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-11972

Assertion failure in sync_array_get_and_reserve_cell()

Details

    Description

      MariaDB 10.1.19 crashed with exception 0x80000003 and out of memory

      2017-01-28 07:09:28 561c  InnoDB: Assertion failure in thread 22044 in file sync0arr.ic line 60
      InnoDB: Failing assertion: reserved
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
      InnoDB: about forcing recovery.
      170128  7:09:28 [ERROR] mysqld got exception 0x80000003 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.1.19-MariaDB
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=21
      max_threads=1001
      thread_count=13621
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2327258 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0xba42928018
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      

      mysqld.exe!my_parameter_handler()[my_init.c:259]
      mysqld.exe!raise()[winsig.c:587]
      mysqld.exe!abort()[abort.c:82]
      mysqld.exe!sync_array_get_and_reserve_cell()[sync0arr.ic:60]
      mysqld.exe!mutex_spin_wait()[sync0sync.cc:631]
      mysqld.exe!dict_sys_get_new_row_id()[dict0boot.ic:38]
      mysqld.exe!row_ins()[row0ins.cc:3295]
      mysqld.exe!row_ins_step()[row0ins.cc:3444]
      mysqld.exe!row_insert_for_mysql()[row0mysql.cc:1921]
      mysqld.exe!ha_innobase::write_row()[ha_innodb.cc:8996]
      mysqld.exe!handler::ha_write_row()[handler.cc:5916]
      mysqld.exe!write_record()[sql_insert.cc:1882]
      mysqld.exe!select_insert::send_data()[sql_insert.cc:3704]
      mysqld.exe!end_send()[sql_select.cc:19450]
      mysqld.exe!evaluate_join_record()[sql_select.cc:18540]
      mysqld.exe!sub_select()[sql_select.cc:18314]
      mysqld.exe!evaluate_join_record()[sql_select.cc:18540]
      mysqld.exe!sub_select()[sql_select.cc:18353]
      mysqld.exe!do_select()[sql_select.cc:17969]
      mysqld.exe!JOIN::exec_inner()[sql_select.cc:3215]
      mysqld.exe!JOIN::exec()[sql_select.cc:2506]
      mysqld.exe!mysql_select()[sql_select.cc:3440]
      mysqld.exe!handle_select()[sql_select.cc:384]
      mysqld.exe!mysql_execute_command()[sql_parse.cc:4001]
      mysqld.exe!sp_instr_stmt::exec_core()[sp_head.cc:3217]
      mysqld.exe!sp_lex_keeper::reset_lex_and_exec_core()[sp_head.cc:2981]
      mysqld.exe!sp_instr_stmt::execute()[sp_head.cc:3133]
      mysqld.exe!sp_head::execute()[sp_head.cc:1319]
      mysqld.exe!sp_head::execute_procedure()[sp_head.cc:2106]
      mysqld.exe!do_execute_sp()[sql_parse.cc:2406]
      mysqld.exe!mysql_execute_command()[sql_parse.cc:5252]
      mysqld.exe!sp_instr_stmt::exec_core()[sp_head.cc:3217]
      mysqld.exe!sp_lex_keeper::reset_lex_and_exec_core()[sp_head.cc:2981]
      mysqld.exe!sp_instr_stmt::execute()[sp_head.cc:3133]
      mysqld.exe!sp_head::execute()[sp_head.cc:1319]
      mysqld.exe!sp_head::execute_procedure()[sp_head.cc:2106]
      mysqld.exe!Event_job_data::execute()[event_data_objects.cc:1433]
      mysqld.exe!Event_worker_thread::run()[event_scheduler.cc:325]
      mysqld.exe!event_worker_thread()[event_scheduler.cc:281]
      mysqld.exe!pthread_start()[my_winthread.c:62]
      mysqld.exe!_callthreadstartex()[threadex.c:376]
      mysqld.exe!_threadstartex()[threadex.c:354]
      KERNEL32.DLL!BaseThreadInitThunk()
      ntdll.dll!RtlUserThreadStart()
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      

      Query (0xc248bc7bf0): INSERT INTO tableau_reporting.v1_dm7_pdu_90days
            SELECT 'V1',
                   'DM7',
                   'A6',
                   v1_dm7_pdu_a6_ndx,
                   PDU_Input_Freq,
                   PDU_Output_kW_Total,
                   PDU_Phase_A_Input_Volts,
                   PDU_Phase_A_Output_Volts,
                   PDU_Phase_B_Input_Volts,
                   PDU_Phase_B_Output_Volts,
                   PDU_Phase_C_Input_Volts,
                   PDU_Phase_C_Output_Volts,
                   if(
                      second(p.t_stamp) < 30,
                      CONVERT(DATE_FORMAT(p.t_stamp, '%Y-%m-%d-%H:%i:00'),
                              DATETIME),
                      CONVERT(DATE_FORMAT(p.t_stamp, '%Y-%m-%d-%H:%i:30'),
                              DATETIME))
                      AS time_stamp,
                   p.t_stamp
            FROM v1_ignition.v1_dm7_pdu_a6 p, tableau_reporting.temp110 t
            WHERE T.T_STAMP = p.T_STAMP
      Connection ID (thread ID): 49951
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
       
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      

      2017-01-28  7:10:22 5940 [ERROR] Event_scheduler::execute_top: Can not create event worker thread (errno=22). Stopping event scheduler
      2017-01-28  7:10:22 58016 [ERROR] mysqld.exe: Out of memory (Needed 65216 bytes)
      2017-01-28  7:10:22 58388 [ERROR] mysqld.exe: Out of memory (Needed 48912 bytes)
      2017-01-28  7:10:22 58016 [ERROR] Out of memory; check if mysqld or some other process uses all available memory; if not, you may have to use 'ulimit' to allow mysqld to use more memory or you can add more swap space
      2017-01-28  7:10:22 58388 [ERROR] Out of memory; check if mysqld or some other process uses all available memory; if not, you may have to use 'ulimit' to allow mysqld to use more memory or you can add more swap space
      

      Attachments

        Issue Links

          Activity

            I think that the algorithm should be changed to loop over all sync_wait_array[] until an available slot is found.

            There is a theoretical possibility that the sync_wait_array[] runs out of slots when there is enough contention between the threads in the server, or if multiple threads are holding a large number of rw-locks or mutexes, such as when multiple threads are splitting (different) b-trees concurrently.

            marko Marko Mäkelä added a comment - I think that the algorithm should be changed to loop over all sync_wait_array[] until an available slot is found. There is a theoretical possibility that the sync_wait_array[] runs out of slots when there is enough contention between the threads in the server, or if multiple threads are holding a large number of rw-locks or mutexes, such as when multiple threads are splitting (different) b-trees concurrently.

            The same problem exists in MariaDB 10.2 as well. Only in the merge of MySQL 5.7.9 the assertion was changed to ut_a(*cell != NULL) just like it was changed by the MySQL 5.7 version of the Oracle change.

            marko Marko Mäkelä added a comment - The same problem exists in MariaDB 10.2 as well. Only in the merge of MySQL 5.7.9 the assertion was changed to ut_a(*cell != NULL) just like it was changed by the MySQL 5.7 version of the Oracle change.

            On a related note, at least the following functions (or their callers) seem to be broken when sync_array_size == 1 does not hold:
            sync_arr_get_n_items(), sync_arr_get_item(), sync_arr_fill_sys_semphore_waits_table(), sync_array_print_innodb().

            I think that we should try to get rid of the sync_array altogether.

            marko Marko Mäkelä added a comment - On a related note, at least the following functions (or their callers) seem to be broken when sync_array_size == 1 does not hold: sync_arr_get_n_items(), sync_arr_get_item(), sync_arr_fill_sys_semphore_waits_table(), sync_array_print_innodb(). I think that we should try to get rid of the sync_array altogether.

            One more flaw in sync_array_get_and_reserve_cell() is that if srv_sync_array_size==0, it would return an uninitialized value. This will cause GCC 7.1.0 to emit warnings for the callers of this function when building with -O3 (several warnings like the following):

            storage/innobase/sync/sync0rw.cc: In function 'void rw_lock_s_lock_spin(rw_lock_t*, ulint, const char*, ulint)':
            storage/innobase/sync/sync0rw.cc:420:24: warning: 'sync_arr' may be used uninitialized in this function [-Wmaybe-uninitialized]
            

            marko Marko Mäkelä added a comment - One more flaw in sync_array_get_and_reserve_cell() is that if srv_sync_array_size==0, it would return an uninitialized value. This will cause GCC 7.1.0 to emit warnings for the callers of this function when building with -O3 (several warnings like the following): storage/innobase/sync/sync0rw.cc: In function 'void rw_lock_s_lock_spin(rw_lock_t*, ulint, const char*, ulint)': storage/innobase/sync/sync0rw.cc:420:24: warning: 'sync_arr' may be used uninitialized in this function [-Wmaybe-uninitialized]

            I do not think that we have fixed anything directly in this area.
            The original report cannot be duplicating MDEV-13983, because that deadlock was introduced later, in 10.1.26.
            Is this still repeatable? If this had occurred in our internal testing, I would have expected updates to this ticket. If this is repeatable in an environment where https://rr-project.org/ runs, it should be possible to find the exact cause and fix it.

            marko Marko Mäkelä added a comment - I do not think that we have fixed anything directly in this area. The original report cannot be duplicating MDEV-13983 , because that deadlock was introduced later, in 10.1.26. Is this still repeatable? If this had occurred in our internal testing, I would have expected updates to this ticket. If this is repeatable in an environment where https://rr-project.org/ runs, it should be possible to find the exact cause and fix it.

            People

              marko Marko Mäkelä
              Marappa Ganesan
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.