MariaDB Server / MDEV-35139

Semaphore wait has lasted > 600 seconds


Details

    • Type: Bug
    • Status: Needs Feedback
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.5.26
    • Fix Version/s: None
    • Component/s: Locking, Server
    • Labels: None
    • Environment: Debian 11

    Description

      Hello,

      we are experiencing intermittent crashes/reboots (about once a week) on one of our systems. The system in question is quite busy and also sees a lot of deadlocks. The only notable change between the earlier, stable state and the current one is an increase in the amount of data processed.

      mariadb.cfg

      port = 3307
      tmp_disk_table_size = 10737418240  #10G
      max_allowed_packet = 134217728
      wait_timeout = 300
      interactive_timeout = 300
      innodb_buffer_pool_size = 300G
      max_connections = 8000
      thread_cache_size = 8000
      innodb_print_all_deadlocks = 1
      innodb_log_file_size = 32G
      innodb_io_capacity = 1000
      innodb_max_dirty_pages_pct = 50
      lock_wait_timeout = 300
      transaction-isolation = 'READ-COMMITTED'
      open_files_limit = 65535
      expire_logs_days = 4
      max_binlog_size = 1000M
      binlog_format = ROW
      sync_binlog = 0
      binlog_annotate_row_events=0
      slave_parallel_mode = CONSERVATIVE
      log_bin_trust_function_creators = ON
      log_slave_updates = ON
      sync_master_info=0
      sync_relay_log=0
      sync_relay_log_info=0
      replicate_annotate_row_events=0
      innodb_flush_log_at_trx_commit = 2
      gtid_strict_mode = 1
      innodb_stats_persistent = OFF
      innodb_stats_auto_recalc = OFF
      innodb_stats_traditional = OFF
      core_file = 1
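
As a reading aid for the option list above, here is a minimal Python sketch that parses `key = value` lines and converts size values with K/M/G suffixes to bytes (MariaDB treats these suffixes as powers of 1024). The helper names `parse_options` and `to_bytes` are made up for this illustration and are not part of MariaDB; the sample is a small excerpt of the settings listed above.

```python
# Multipliers for the K/M/G size suffixes (powers of 1024).
SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3}

# Excerpt of the option list above, used as sample input.
SAMPLE = """\
tmp_disk_table_size = 10737418240  #10G
innodb_buffer_pool_size = 300G
innodb_log_file_size = 32G
max_binlog_size = 1000M
transaction-isolation = 'READ-COMMITTED'
"""


def parse_options(text):
    """Return a dict of option name -> raw string value."""
    opts = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        # Dashes and underscores are interchangeable in option names,
        # so normalize to underscores for lookups.
        opts[key.replace("-", "_")] = value.strip("'\"")
    return opts


def to_bytes(value):
    """Convert a size such as '300G' or '10737418240' to a byte count."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in SUFFIXES:
        return int(value[:-1]) * SUFFIXES[suffix]
    return int(value)


if __name__ == "__main__":
    opts = parse_options(SAMPLE)
    for name in ("tmp_disk_table_size", "innodb_buffer_pool_size",
                 "innodb_log_file_size", "max_binlog_size"):
        print(f"{name} = {to_bytes(opts[name])} bytes")
```

With this, the `#10G` comment on `tmp_disk_table_size` can be checked directly: 10737418240 bytes is exactly 10 * 1024^3.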
      

      error log

      ...
      2024-09-30 11:36:01 0 [Note] InnoDB: A semaphore wait:
      --Thread 139958540924672 has waited at dict0dict.cc line 1094 for 119.00 seconds the semaphore:
      Mutex at 0x55d3ccb631c0, Mutex DICT_SYS created ./storage/innobase/dict/dict0dict.cc:1038, lock var 2
       
      2024-09-30 11:36:01 0 [Note] InnoDB: A semaphore wait:
      --Thread 139969000175360 has waited at row0undo.cc line 412 for 119.00 seconds the semaphore:
      S-lock on RW-latch at 0x55d3ccb631f8 created in file dict0dict.cc line 1047
      a writer (thread id 139958803928832) has reserved it in mode  exclusive
      number of readers 0, waiters flag 1, lock_word: 0
      Last time write locked in file handler0alter.cc line 11265
      2024-09-30 11:36:01 0 [Note] InnoDB: A semaphore wait:
      --Thread 139958496712448 has waited at dict0dict.cc line 1094 for 119.00 seconds the semaphore:
      Mutex at 0x55d3ccb631c0, Mutex DICT_SYS created ./storage/innobase/dict/dict0dict.cc:1038, lock var 2
      ...
      2024-09-30 12:39:17 0 [Note] InnoDB: A semaphore wait:
      --Thread 139971173091072 has waited at dict0dict.cc line 1094 for 31.00 seconds the semaphore:
      Mutex at 0x55d3ccb631c0, Mutex DICT_SYS created ./storage/innobase/dict/dict0dict.cc:1038, lock var 2
      ...
      2024-09-30 12:39:17 0 [Note] InnoDB: A semaphore wait:
      --Thread 139971126847232 has waited at trx0trx.cc line 883 for 546.00 seconds the semaphore:
      Mutex at 0x55d552b97ae0, Mutex REDO_RSEG created ./storage/innobase/trx/trx0rseg.cc:417, lock var 2
      ...
      2024-09-30 12:39:17 0 [Note] InnoDB: A semaphore wait:
      --Thread 139971147429632 has waited at trx0trx.cc line 883 for 575.00 seconds the semaphore:
      Mutex at 0x55d552b97ae0, Mutex REDO_RSEG created ./storage/innobase/trx/trx0rseg.cc:417, lock var 2
       
      2024-09-30 12:39:17 0 [Note] InnoDB: A semaphore wait:
      --Thread 139971330160384 has waited at trx0trx.cc line 883 for 256.00 seconds the semaphore:
      Mutex at 0x55d552b97ae0, Mutex REDO_RSEG created ./storage/innobase/trx/trx0rseg.cc:417, lock var 2
       
      InnoDB: Pending reads 1, writes 0
      2024-09-30 12:39:17 0 [ERROR] [FATAL] InnoDB: Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung.
      240930 12:39:17 [ERROR] mysqld got signal 6 ;
      Sorry, we probably made a mistake, and this is a bug.
       
      Your assistance in bug reporting will enable us to fix this for the next release.
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.5.26-MariaDB-0+deb11u2-log source revision: 7a5b8bf0f5470a13094101f0a4bdfa9e1b9ded02
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=4001
      max_threads=4002
      thread_count=4001
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8941918 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      Printing to addr2line failed
      /usr/sbin/mariadbd(my_print_stacktrace+0x2e)[0x55d3cc34abfe]
      /usr/sbin/mariadbd(handle_fatal_signal+0x475)[0x55d3cbe36645]
      /lib/x86_64-linux-gnu/libpthread.so.0
      /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x141)[0x7f99f854fd51]
      /lib/x86_64-linux-gnu/libc.so.6(abort+0x123)[0x7f99f8539537]
      /usr/sbin/mariadbd(+0x65a4ad)[0x55d3cbad84ad]
      /usr/sbin/mariadbd(+0x650d20)[0x55d3cbaced20]
      /usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic13timer_generic7executeEPv+0x38)[0x55d3cc2ec0c8]
      /usr/sbin/mariadbd(_ZN5tpool4task7executeEv+0x32)[0x55d3cc2ed292]
      /usr/sbin/mariadbd(_ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE+0x4f)[0x55d3cc2eaf9f]
      /lib/x86_64-linux-gnu/libstdc++.so.6(+0xceed0)[0x7f99f88ffed0]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7)[0x7f99f8a0bea7]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f99f8612acf]
      The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mariadbd/ contains
      information that should help you find out what is causing the crash.
      Writing a core file...
      Working directory at /var/lib/mysql-data
      Resource Limits:
      Limit                     Soft Limit           Hard Limit           Units
      Max cpu time              unlimited            unlimited            seconds
      Max file size             unlimited            unlimited            bytes
      Max data size             unlimited            unlimited            bytes
      Max stack size            8388608              unlimited            bytes
      Max core file size        unlimited            unlimited            bytes
      Max resident set          unlimited            unlimited            bytes
      Max processes             2061719              2061719              processes
      Max open files            131070               131070               files
      Max locked memory         65536                65536                bytes
      Max address space         unlimited            unlimited            bytes
      Max file locks            unlimited            unlimited            locks
      Max pending signals       2061719              2061719              signals
      Max msgqueue size         819200               819200               bytes
      Max nice priority         0                    0
      Max realtime priority     0                    0
      Max realtime timeout      unlimited            unlimited            us
      Core pattern: core
       
      Kernel version: Linux version 5.10.0-27-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.205-2 (2023-12-31)
       
      2024-09-30 12:40:28 0 [Note] Starting MariaDB 10.5.26-MariaDB-0+deb11u2-log source revision 7a5b8bf0f5470a13094101f0a4bdfa9e1b9ded02 server_uid Bf2sgYgSusKDXowmzo3Sx8bpGj4= as process 3945280
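
To make the semaphore waits in the log above easier to compare, here is a minimal Python sketch that groups the `--Thread ... has waited at ...` lines by source location and reports how many threads waited there and the longest wait. The function name `summarize_waits` and the regular expression are assumptions written for this illustration, based only on the line format visible in this log; the sample holds three entries copied from the 12:39:17 output.

```python
import re
from collections import defaultdict

# Three entries copied from the error log above.
SAMPLE = """\
2024-09-30 12:39:17 0 [Note] InnoDB: A semaphore wait:
--Thread 139971173091072 has waited at dict0dict.cc line 1094 for 31.00 seconds the semaphore:
Mutex at 0x55d3ccb631c0, Mutex DICT_SYS created ./storage/innobase/dict/dict0dict.cc:1038, lock var 2
2024-09-30 12:39:17 0 [Note] InnoDB: A semaphore wait:
--Thread 139971126847232 has waited at trx0trx.cc line 883 for 546.00 seconds the semaphore:
Mutex at 0x55d552b97ae0, Mutex REDO_RSEG created ./storage/innobase/trx/trx0rseg.cc:417, lock var 2
2024-09-30 12:39:17 0 [Note] InnoDB: A semaphore wait:
--Thread 139971147429632 has waited at trx0trx.cc line 883 for 575.00 seconds the semaphore:
Mutex at 0x55d552b97ae0, Mutex REDO_RSEG created ./storage/innobase/trx/trx0rseg.cc:417, lock var 2
"""

# Matches the "--Thread <id> has waited at <file> line <n> for <s> seconds" lines.
WAIT_RE = re.compile(
    r"--Thread (?P<thread>\d+) has waited at (?P<file>\S+) line (?P<line>\d+) "
    r"for (?P<secs>[\d.]+) seconds the semaphore:"
)


def summarize_waits(log_text):
    """Return {"file:line": (waiter_count, longest_wait_seconds)}."""
    stats = defaultdict(lambda: [0, 0.0])
    for m in WAIT_RE.finditer(log_text):
        loc = f"{m['file']}:{m['line']}"
        stats[loc][0] += 1
        stats[loc][1] = max(stats[loc][1], float(m["secs"]))
    return {loc: (count, longest) for loc, (count, longest) in stats.items()}


if __name__ == "__main__":
    summary = summarize_waits(SAMPLE)
    # Print the locations with the longest waits first.
    for loc, (count, longest) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{loc}: {count} waiter(s), longest {longest:.0f}s")
```

Run against the full error log instead of `SAMPLE`, the same function would show which wait locations dominate before the 600-second threshold is reached.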
      

      There is also a 25G core dump, though I am hesitant to upload it, considering its size and the potentially sensitive information it contains.

            People

              Assignee: Marko Mäkelä (marko)
              Reporter: Marek Hlavka (sumark)