Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-11843

MariaDB Server Fail - "Has Gone Away" & fail to restart

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Duplicate
    • 10.2.3
    • 10.2.5, 10.3.0
    • LAMP Stack: Ubuntu Trusty 14.04 + PHP FPM 7.1 + Apache 2 + MariaDB 10.2

    Description

      All of a sudden, without any particular operation being carried out, error "Mysql server has gone away". Trying to restart the server, the attached error is shown.

      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 0x7fdcf43fc700  InnoDB: Assertion failure in thread 140586967353088 in file trx0purge.cc line 168
      Jan 19 16:59:07 devmachine mysqld: InnoDB: Failing assertion: purge_sys->iter.trx_no <= purge_sys->rseg->last_trx_no
      Jan 19 16:59:07 devmachine mysqld: InnoDB: We intentionally generate a memory trap.
      Jan 19 16:59:07 devmachine mysqld: InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
      Jan 19 16:59:07 devmachine mysqld: InnoDB: If you get repeated assertion failures or crashes, even
      Jan 19 16:59:07 devmachine mysqld: InnoDB: immediately after the mysqld startup, there may be
      Jan 19 16:59:07 devmachine mysqld: InnoDB: corruption in the InnoDB tablespace. Please refer to
      Jan 19 16:59:07 devmachine mysqld: InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
      Jan 19 16:59:07 devmachine mysqld: InnoDB: about forcing recovery.
      Jan 19 16:59:07 devmachine mysqld: 170119 16:59:07 [ERROR] mysqld got signal 6 ;
      Jan 19 16:59:07 devmachine mysqld: This could be because you hit a bug. It is also possible that this binary
      Jan 19 16:59:07 devmachine mysqld: or one of the libraries it was linked against is corrupt, improperly built,
      Jan 19 16:59:07 devmachine mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
      Jan 19 16:59:07 devmachine mysqld:
      Jan 19 16:59:07 devmachine mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
      Jan 19 16:59:07 devmachine mysqld:
      Jan 19 16:59:07 devmachine mysqld: We will try our best to scrape up some info that will hopefully help
      Jan 19 16:59:07 devmachine mysqld: diagnose the problem, but since we have already crashed,
      Jan 19 16:59:07 devmachine mysqld: something is definitely wrong and this may fail.
      Jan 19 16:59:07 devmachine mysqld:
      Jan 19 16:59:07 devmachine mysqld: Server version: 10.2.3-MariaDB-10.2.3+maria~trusty
      Jan 19 16:59:07 devmachine mysqld: key_buffer_size=16777216
      Jan 19 16:59:07 devmachine mysqld: read_buffer_size=131072
      Jan 19 16:59:07 devmachine mysqld: max_used_connections=0
      Jan 19 16:59:07 devmachine mysqld: max_threads=153
      Jan 19 16:59:07 devmachine mysqld: thread_count=5
      Jan 19 16:59:07 devmachine mysqld: It is possible that mysqld could use up to
      Jan 19 16:59:07 devmachine mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352511 K  bytes of memory
      Jan 19 16:59:07 devmachine mysqld: Hope that's ok; if not, decrease some variables in the equation.
      Jan 19 16:59:07 devmachine mysqld:
      Jan 19 16:59:07 devmachine mysqld: Thread pointer: 0x0x7fdd13ff6008
      Jan 19 16:59:07 devmachine mysqld: Attempting backtrace. You can use the following information to find out
      Jan 19 16:59:07 devmachine mysqld: where mysqld died. If you see no messages after this, something went
      Jan 19 16:59:07 devmachine mysqld: terribly wrong...
      Jan 19 16:59:07 devmachine mysqld: stack_bottom = 0x7fdcf43fbcc0 thread_stack 0x40000
      Jan 19 16:59:07 devmachine mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7fdd3304a67e]
      Jan 19 16:59:07 devmachine mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x2e5)[0x7fdd32a86b25]
      Jan 19 16:59:07 devmachine mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7fdd30f3e340]
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Note] InnoDB: 5.7.14 started; log sequence number 4528342335
      Jan 19 16:59:07 devmachine mysqld: /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x39)[0x7fdd30391cc9]
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140586921228032 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Note] Plugin 'FEEDBACK' is disabled.
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Note] Recovering after a crash using tc.log
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Note] Starting crash recovery...
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Note] Crash recovery finished.
      Jan 19 16:59:07 devmachine mysqld: /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fdd303950d8]
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140586921228032 [Note] InnoDB: Buffer pool(s) load completed at 170119 16:59:07
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Warning] Failed to setup SSL
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Warning] SSL error: SSL_CTX_set_default_verify_paths failed
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Warning] SSL error: error:02001002:system library:fopen:No such file or directory
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Warning] SSL error: error:2006D080:BIO routines:BIO_new_file:no such file
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Warning] SSL error: error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Note] Server socket created on IP: '127.0.0.1'.
      Jan 19 16:59:07 devmachine mysqld: /usr/sbin/mysqld(+0x414c7c)[0x7fdd3284bc7c]
      Jan 19 16:59:07 devmachine mysqld: 2017-01-19 16:59:07 140588027934656 [Note] /usr/sbin/mysqld: ready for connections.
      Jan 19 16:59:07 devmachine mysqld: Version: '10.2.3-MariaDB-10.2.3+maria~trusty'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
      Jan 19 16:59:07 devmachine mysqld: /usr/sbin/mysqld(+0x97455e)[0x7fdd32dab55e]
      Jan 19 16:59:07 devmachine mysqld: /usr/sbin/mysqld(+0x974740)[0x7fdd32dab740]
      Jan 19 16:59:07 devmachine mysqld: /usr/sbin/mysqld(+0x978078)[0x7fdd32daf078]
      Jan 19 16:59:07 devmachine mysqld: /usr/sbin/mysqld(+0x94eb85)[0x7fdd32d85b85]
      Jan 19 16:59:07 devmachine mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7fdd30f36182]
      Jan 19 16:59:07 devmachine mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fdd3045547d]
      Jan 19 16:59:07 devmachine mysqld:
      Jan 19 16:59:07 devmachine mysqld: Trying to get some variables.
      Jan 19 16:59:07 devmachine mysqld: Some pointers may be invalid and cause the dump to abort.
      Jan 19 16:59:07 devmachine mysqld: Query (0x0): is an invalid pointer
      Jan 19 16:59:07 devmachine mysqld: Connection ID (thread ID): 2
      Jan 19 16:59:07 devmachine mysqld: Status: NOT_KILLED
      Jan 19 16:59:07 devmachine mysqld:
      Jan 19 16:59:07 devmachine mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on
      

      Attachments

        Issue Links

          Activity

            Hi Fabio,
            The interesting part of the log is missing: what happened before the first failed restart? I would be interested in that part of the server history.

            The cause of the startup failure seems to be some form of undo log corruption. There is another issue MDEV-11044 that is also caused by undo log corruption, but we do not know what caused the corruption in the first place.

            Please back up all the files related to this incident. Can you start the server under gdb, and then get a stack trace from it? Or can you enable core dumps and show the stack trace from the core dump? That would get us started. I would then instruct you to extract the corrupted page from the buffer pool.

            marko Marko Mäkelä added a comment - Hi Fabio, The interesting part of the log is missing: what happened before the first failed restart? I would be interested in that part of the server history. The cause of the startup failure seems to be some form of undo log corruption. There is another issue MDEV-11044 that is also caused by undo log corruption, but we do not know what caused the corruption in the first place. Please back up all the files related to this incident. Can you start the server under gdb, and then get a stack trace from it? Or can you enable core dumps and show the stack trace from the core dump? That would get us started. I would then instruct you to extract the corrupted page from the buffer pool.

            Note there is another bug report MDEV-11894 with the same assertion failure.

            elenst Elena Stepanova added a comment - Note there is another bug report MDEV-11894 with the same assertion failure.
            elenst Elena Stepanova added a comment - - edited

            marko wrote:

            I will close MDEV-11843 similarly, if @elenst can confirm that neither MDEV-11927 nor MDEV-11843 have been observed since 10.2.5.

            I haven't found in buildbot records any searchable indication of either MDEV-9663 or MDEV-11927 or MDEV-11843 other than two occurrences of InnoDB: tried to purge non-delete-marked record in index, both of which happened on 10.2 prior to 10.2.5. Of course, buildbot only stores the output, which, as we know, does not always contain details of the crashes, but we don't have anything else to search for, so I think we can consider it fixed for now (and since there were no commits in scope of this bug, we can re-open it later if the problem re-surfaces).

            elenst Elena Stepanova added a comment - - edited marko wrote: I will close MDEV-11843 similarly, if @elenst can confirm that neither MDEV-11927 nor MDEV-11843 have been observed since 10.2.5. I haven't found in buildbot records any searchable indication of either MDEV-9663 or MDEV-11927 or MDEV-11843 other than two occurrences of InnoDB: tried to purge non-delete-marked record in index , both of which happened on 10.2 prior to 10.2.5. Of course, buildbot only stores the output, which, as we know, does not always contain details of the crashes, but we don't have anything else to search for, so I think we can consider it fixed for now (and since there were no commits in scope of this bug, we can re-open it later if the problem re-surfaces).

            People

              marko Marko Mäkelä
              fabdan Fabio Daniele
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.