Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13267

At startup with crash recovery: mtr_t::commit_checkpoint(lsn_t, bool): Assertion `!recv_no_log_write' failed

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/qa-kvm-linux/builds/1658/steps/stable_tests/logs/stdio

      tokudb_bugs.xa-3 'innodb'                w2 [ fail ]
              Test ended at 2017-07-05 13:24:51
       
      CURRENT_TEST: tokudb_bugs.xa-3
       
       
      Server [mysqld.1 - pid: 2997, winpid: 2997, exit: 256] failed during test run
      Server log from this test:
      ----------SERVER LOG START-----------
      SIGKILL myself
      2017-07-05 13:24:49 140090411992896 [Note] /home/buildbot/build/sql/mysqld (mysqld 10.2.7-MariaDB-debug-log) starting as process 2998 ...
      2017-07-05 13:24:49 140090411992896 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4162)
      2017-07-05 13:24:49 140090411992896 [Note] Plugin 'SEQUENCE' is disabled.
      Wed Jul  5 13:24:49 2017 PerconaFT recovery starting in env /home/buildbot/build/mysql-test/var/2/mysqld.1/data/
      Wed Jul  5 13:24:49 2017 PerconaFT recovery scanning backward from 545
      Wed Jul  5 13:24:49 2017 PerconaFT recovery bw_end_checkpoint at 12 timestamp 1499275474701350 xid 8 (bw_newer)
      Wed Jul  5 13:24:49 2017 PerconaFT recovery bw_begin_checkpoint at 8 timestamp 1499275474700634 (bw_between)
      Wed Jul  5 13:24:49 2017 PerconaFT recovery turning around at begin checkpoint 8 time 716
      Wed Jul  5 13:24:49 2017 PerconaFT recovery starts scanning forward to 545 from 8 left 537 (fw_between)
      Wed Jul  5 13:24:49 2017 PerconaFT recovery closing 2 dictionaries
      Wed Jul  5 13:24:49 2017 PerconaFT recovery making a checkpoint
      Wed Jul  5 13:24:49 2017 PerconaFT recovery done
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Uses event mutexes
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Compressed tables use zlib 1.2.8
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Using Linux native AIO
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Number of pools: 1
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Using generic crc32 instructions
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Initializing buffer pool, total size = 8M, instances = 1, chunk size = 8M
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Completed initialization of buffer pool
      2017-07-05 13:24:49 140089241872128 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Highest supported file format is Barracuda.
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Starting crash recovery from checkpoint LSN=2785956
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Ignoring data file './test/t1.ibd' with space ID 14, since the redo log references ./test/t1.ibd with space ID 7.
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Ignoring data file './test/t1.ibd' with space ID 14, since the redo log references ./test/t1.ibd with space ID 13.
      2017-07-05 13:24:49 140090411992896 [Note] InnoDB: Starting a batch to recover 179 pages from redo log.
      mysqld: /home/buildbot/build/storage/innobase/mtr/mtr0mtr.cc:603: void mtr_t::commit_checkpoint(lsn_t, bool): Assertion `!recv_no_log_write' failed.
      170705 13:24:50 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.2.7-MariaDB-debug-log
      key_buffer_size=1048576
      read_buffer_size=131072
      max_used_connections=0
      max_threads=153
      thread_count=0
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63068 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      /home/buildbot/build/sql/mysqld(my_print_stacktrace+0x38)[0x7f6956b6c3c8]
      /home/buildbot/build/sql/mysqld(handle_fatal_signal+0x3a3)[0x7f6956404810]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f69557bd890]
      /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f6953af3067]
      /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f6953af4448]
      /lib/x86_64-linux-gnu/libc.so.6(+0x2e266)[0x7f6953aec266]
      /lib/x86_64-linux-gnu/libc.so.6(+0x2e312)[0x7f6953aec312]
      /home/buildbot/build/sql/mysqld(+0xac51ae)[0x7f69566b31ae]
      /home/buildbot/build/sql/mysqld(+0xd21f62)[0x7f695690ff62]
      /home/buildbot/build/sql/mysqld(+0xab8f30)[0x7f69566a6f30]
      /home/buildbot/build/sql/mysqld(+0xbcef01)[0x7f69567bcf01]
      /home/buildbot/build/sql/mysqld(+0xa0f039)[0x7f69565fd039]
      /home/buildbot/build/sql/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0xfc)[0x7f6956406849]
      mysys/stacktrace.c:267(my_print_stacktrace)[0x7f69561b14f8]
      /home/buildbot/build/sql/mysqld(_Z11plugin_initPiPPci+0x912)[0x7f69561b2101]
      sql/signal_handler.cc:168(handle_fatal_signal)[0x7f69560bef35]
      /home/buildbot/build/sql/mysqld(_Z11mysqld_mainiPPc+0x5f4)[0x7f69560bffba]
      fil/fil0fil.cc:6580(fil_names_clear(unsigned long, bool))[0x7f69560b4ee0]
      /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f6953adfb45]
      log/log0recv.cc:3351(recv_recovery_from_checkpoint_start(unsigned long))[0x7f69560b4db9]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      Writing a core file
      ----------SERVER LOG END-------------
      mysqltest failed but provided no output
      The result from queries just before the failure was:
      < snip >
      drop table if exists t1, t2;
      CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=innodb;
      CREATE TABLE t2 (b INT PRIMARY KEY) ENGINE=tokudb;
      BEGIN;
      SELECT @@log_bin;
      @@log_bin
      0
      INSERT INTO t1 VALUES (1);
      INSERT INTO t2 VALUES (2);
      SET SESSION debug_dbug="d,crash_commit_before";
      COMMIT;
      ERROR HY000: Lost connection to MySQL server during query
       
       
       
       - saving '/home/buildbot/build/mysql-test/var/2/log/tokudb_bugs.xa-3-innodb/' to '/home/buildbot/build/mysql-test/var/log/tokudb_bugs.xa-3-innodb/'
      worker[2] > Restart  - not started
      ***Warnings generated in error logs during shutdown after running tests: tokudb_bugs.xa-3
      

      I also had the same assertion failure with tokudb_bugs.xa-4 on a local debug build.

      Attachments

        Activity

          There were no server logs for the reported failure. Luckily, I ran into this very same assertion failure when testing a merge into 10.3 locally. For me, it failed for two tests that performed InnoDB crash recovery:
          innodb.innodb-alter-tempfile
          gcol.gcol_rollback

          In all the 3 crashes (the first test also with innodb_page_size=4k), the stack trace was like this:

          #8  0x000055fa2f7d41f6 in mtr_t::commit_checkpoint (this=0x7ffe265635f0, 
              checkpoint_lsn=732924, write_mlog_checkpoint=true)
              at /mariadb/10.3/storage/innobase/mtr/mtr0mtr.cc:603
          #9  0x000055fa2fa5ab92 in fil_names_clear (lsn=732924, do_write=true)
              at /mariadb/10.3/storage/innobase/fil/fil0fil.cc:6580
          #10 0x000055fa2f7c6396 in recv_recovery_from_checkpoint_start (
              flush_lsn=732225) at /mariadb/10.3/storage/innobase/log/log0recv.cc:3412
          #11 0x000055fa2f8f2d99 in innobase_start_or_create_for_mysql ()
              at /mariadb/10.3/storage/innobase/srv/srv0start.cc:2197
          

          When it comes to the MariaDB server, this is a bogus debug assertion failure. While generating page-change redo log records is strictly out of the question during tat certain parts of crash recovery, the fil_names_clear() is only emitting informational MLOG_FILE_NAME records to guarantee that if the server is killed during or soon after the crash recovery, the subsequent crash recovery will succeed.

          The metadata buffer that fil_names_clear() is flushing to the redo log is being filled by recv_init_crash_recovery_spaces(), right before starting to apply redo log, by invoking fil_names_dirty() on every dirty tablespace that was found.

          When it comes to Mariabackup (xtrabackup --prepare), it is strictly out of the question to generate any redo log whatsoever, because that could break the restore of incremental backups by causing LSN deviation. So, the fil_names_dirty() call must be skipped when restoring backups.

          marko Marko Mäkelä added a comment - There were no server logs for the reported failure. Luckily, I ran into this very same assertion failure when testing a merge into 10.3 locally. For me, it failed for two tests that performed InnoDB crash recovery: innodb.innodb-alter-tempfile gcol.gcol_rollback In all the 3 crashes (the first test also with innodb_page_size=4k), the stack trace was like this: #8 0x000055fa2f7d41f6 in mtr_t::commit_checkpoint (this=0x7ffe265635f0, checkpoint_lsn=732924, write_mlog_checkpoint=true) at /mariadb/10.3/storage/innobase/mtr/mtr0mtr.cc:603 #9 0x000055fa2fa5ab92 in fil_names_clear (lsn=732924, do_write=true) at /mariadb/10.3/storage/innobase/fil/fil0fil.cc:6580 #10 0x000055fa2f7c6396 in recv_recovery_from_checkpoint_start ( flush_lsn=732225) at /mariadb/10.3/storage/innobase/log/log0recv.cc:3412 #11 0x000055fa2f8f2d99 in innobase_start_or_create_for_mysql () at /mariadb/10.3/storage/innobase/srv/srv0start.cc:2197 When it comes to the MariaDB server, this is a bogus debug assertion failure. While generating page-change redo log records is strictly out of the question during tat certain parts of crash recovery, the fil_names_clear() is only emitting informational MLOG_FILE_NAME records to guarantee that if the server is killed during or soon after the crash recovery, the subsequent crash recovery will succeed. The metadata buffer that fil_names_clear() is flushing to the redo log is being filled by recv_init_crash_recovery_spaces(), right before starting to apply redo log, by invoking fil_names_dirty() on every dirty tablespace that was found. When it comes to Mariabackup (xtrabackup --prepare), it is strictly out of the question to generate any redo log whatsoever, because that could break the restore of incremental backups by causing LSN deviation. So, the fil_names_dirty() call must be skipped when restoring backups.

          People

            marko Marko Mäkelä
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.