Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Cannot Reproduce
    • 10.1.22, 10.1.30
    • N/A
    • None
    • linux, backup taken by Zmanda

    Description

      Startup error log:

      mysql-maria-10.1.22 > $(pwd)/bin/mysqld --defaults-file=./my.cnf --user=mysql
      2018-01-29 13:18:42 140584210446592 [Note] /usr/local/mysql-maria-10.1.22/bin/mysqld (mysqld 10.1.22-MariaDB) starting as process 12876 ...
      2018-01-29 13:18:43 7fdc4fecb900 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: The InnoDB memory heap is disabled
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Compressed tables use zlib 1.2.8
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Using Linux native AIO
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Using SSE crc32 instructions
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Initializing buffer pool, size = 2.0G
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Completed initialization of buffer pool
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Highest supported file format is Barracuda.
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Starting crash recovery from checkpoint LSN=685398305
      2018-01-29 13:18:43 140584210446592 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
      2018-01-29 13:18:44 140584210446592 [Note] InnoDB: Starting final batch to recover 23 pages from redo log
      180129 13:18:44 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.1.22-MariaDB
      key_buffer_size=402653184
      read_buffer_size=2097152
      max_used_connections=0
      max_threads=153
      thread_count=0
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1023025 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x48400
      /usr/local/mysql-maria-10.1.22/bin/mysqld(my_print_stacktrace+0x29)[0x5635d695f549]
      /usr/local/mysql-maria-10.1.22/bin/mysqld(handle_fatal_signal+0x2f5)[0x5635d6547895]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fdc4e37f390]
      /usr/local/mysql-maria-10.1.22/bin/mysqld(+0x6fe605)[0x5635d66e6605]
      /usr/local/mysql-maria-10.1.22/bin/mysqld(+0x70007e)[0x5635d66e807e]
      /usr/local/mysql-maria-10.1.22/bin/mysqld(+0x6e5e70)[0x5635d66cde70]
      /usr/local/mysql-maria-10.1.22/bin/mysqld(+0x6e7a13)[0x5635d66cfa13]
      /usr/local/mysql-maria-10.1.22/bin/mysqld(+0x7c595f)[0x5635d67ad95f]
      /usr/local/mysql-maria-10.1.22/bin/mysqld(+0x80dbd9)[0x5635d67f5bd9]
      /usr/local/mysql-maria-10.1.22/bin/mysqld(+0x7612e8)[0x5635d67492e8]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fdc4e3756ba]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fdc4da2041d]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      Writing a core file
      Segmentation fault (core dumped)
      

      GDB backtrace:

      (gdb) bt
      #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
      #1  0x000055985134d15a in my_write_core (sig=sig@entry=11) at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/mysys/stacktrace.c:477
      #2  0x0000559850f35938 in handle_fatal_signal (sig=11) at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/sql/signal_handler.cc:296
      #3  <signal handler called>
      #4  mach_read_from_1 (b=0xfffffffffffffffb <error: Cannot access memory at address 0xfffffffffffffffb>)
          at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/include/mach0data.ic:55
      #5  rec_get_bit_field_1 (shift=0, mask=15, offs=5, rec=0x0) at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/include/rem0rec.ic:169
      #6  rec_get_n_owned_new (rec=0x0) at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/include/rem0rec.ic:549
      #7  page_rec_find_owner_rec (rec=0x7fa035ec807d "") at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/include/page0page.ic:934
      #8  page_cur_insert_rec_low (current_rec=0x7fa035ec8179 "", index=index@entry=0x7f9fd2833468, rec=0x7f9fdbffc295 "", offsets=<optimized out>, mtr=mtr@entry=0x7f9fdbffc7c0)
          at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/page/page0cur.cc:1152
      #9  0x00005598510d607e in page_cur_rec_insert (mtr=0x7f9fdbffc7c0, offsets=<optimized out>, index=0x7f9fd2833468, rec=<optimized out>, cursor=0x7f9fdbffbf60)
          at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/include/page0cur.ic:326
      #10 page_cur_parse_insert_rec (is_short=is_short@entry=0, ptr=0x7fa08e530883 <incomplete sequence \330>, end_ptr=end_ptr@entry=0x7fa08e53089d "", 
          block=block@entry=0x7fa025ff1390, index=0x7f9fd2833468, mtr=mtr@entry=0x7f9fdbffc7c0)
          at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/page/page0cur.cc:926
      #11 0x00005598510bbe70 in recv_parse_or_apply_log_rec_body (type=<optimized out>, ptr=<optimized out>, ptr@entry=0x7fa08e530870 "", end_ptr=0x7fa08e53089d "", 
          block=block@entry=0x7fa025ff1390, mtr=mtr@entry=0x7f9fdbffc7c0, space_id=<optimized out>)
          at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/log/log0recv.cc:1175
      #12 0x00005598510bda13 in recv_recover_page_func (just_read_in=just_read_in@entry=1, block=block@entry=0x7fa025ff1390)
          at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/log/log0recv.cc:1741
      #13 0x000055985119b95f in buf_page_io_complete (bpage=0x7fa025ff1390) at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/buf/buf0buf.cc:4856
      #14 0x00005598511e3bd9 in fil_aio_wait (segment=segment@entry=2) at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/fil/fil0fil.cc:6323
      #15 0x00005598511372e8 in io_handler_thread (arg=<optimized out>) at /home/hartmut/projects/mariadb/releases/mariadb-10.1.22/storage/xtradb/srv/srv0start.cc:546
      #16 0x00007fa0902f96ba in start_thread (arg=0x7f9fdbffd700) at pthread_create.c:333
      #17 0x00007fa08f9a441d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            Might be related to MDEV-14420 and/or MDEV-13192.

            elenst Elena Stepanova added a comment - - edited Might be related to MDEV-14420 and/or MDEV-13192 .

            I do not think that this is related to MDEV-14420, because this fails in a non-debug build and it is a B-tree page, not page number 0.
            This is also unrelated to MDEV-13192, because the table is not using ROW_FORMAT=COMPRESSED.
            It is the clustered index root page (page number 3) of tablespace 3.
            When applying the redo log record, the page is almost empty (no data after the "infimum" and "supremum").
            We have 2565 redo log records to that page according to --debug=d,ib_log. The problematic record appears to be right at the start of the recovery:

            2018-02-02 20:25:26 140737353852736 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
            innobase_init: ib_log: scan 685398305: log rec 24 len 9 page 0:301
            innobase_init: ib_log: scan 685398314: log rec 20 len 14 page 0:301
            innobase_init: ib_log: scan 685398328: log rec 38 len 64 page 4:1353
            innobase_init: ib_log: scan 685398392: multi-log rec 8 len 11 page 4:1301
            innobase_init: ib_log: scan 685398392: multi-log rec 38 len 14 page 4:1301
            innobase_init: ib_log: scan 685398392: multi-log rec 31 len 1 page 4:1301
            innobase_init: ib_log: scan 685398418: log rec 20 len 23 page 0:301
            innobase_init: ib_log: scan 685398441: log rec 38 len 48 page 3:3
            

            This could be a transaction that is writing undo log to page 0:301 and then modifying two tables, in tablespaces 3 and 4. The table in tablespace 3 should be almost empty, because the data appears to have been in the root page. The table name is mysql.gtid_slave_pos.
            Here is the page dump:

            root page of mysql.gtid_slave_pos

            000000 81 60 7b 75 00 00 00 03 ff ff ff ff ff ff ff ff
            000010 00 00 00 00 00 18 a1 2f 45 bf 00 00 00 00 00 00
            000020 00 00 00 00 00 03 00 02 00 78 80 02 00 00 00 00
            000030 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 00
            000040 00 00 00 00 00 00 00 00 00 13 00 00 00 03 00 00
            000050 00 02 00 f2 00 00 00 03 00 00 00 02 00 32 01 00
            000060 02 00 0d 69 6e 66 69 6d 75 6d 00 01 00 0b 00 00
            000070 73 75 70 72 65 6d 75 6d 00 00 00 00 00 00 00 00
            000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            003ff0 00 00 00 00 00 70 00 63 d0 27 09 cc 00 18 a1 2f
            004000
            

            Now, the problem is that the MLOG_COMP_PAGE_INSERT record (type 38) refers to the insert position 0x179 from the start of the page, but as you can see, the page is empty; it does not contain anything after the page supremum record at 0x70. The next-record pointer of the page infimum is pointing to the page supremum (the 16-bit big endian field at 0x61 contains the relative offset 0x0d).

            I see only one set of ib_logfile* in the provided mysql.zip, so I do not think that could have used the wrong redo logs for the data files.

            Above, we see that the 64-bit field FIL_PAGE_LSN at offset 0x10 is 0x18a12f, or 1,614,127. It is much less than the redo log LSN of 685,398,305. These numbers are roughly bytes written to the redo log. It is possible that the LSN 1,614,127 was written when the database was originally initialized. All the nonempty (first 4) pages of the file contain the same FIL_PAGE_LSN.

            It seems to me that something must have gone wrong when creating the backup. It looks like a much too old version of the data files was copied or restored. It is also possible that in case a file system snapshot was used, it did not work correctly.

            This does not look like a bug to me, but rather incorrect usage.

            My wild guess is that someone tried to replace the gtid_slave_pos.ibd with an empty file from a newly initialized instance, to work around some problem. In that case, it would be a better idea to delete the file altogether, and let InnoDB skip any redo log for it.

            marko Marko Mäkelä added a comment - I do not think that this is related to MDEV-14420 , because this fails in a non-debug build and it is a B-tree page, not page number 0. This is also unrelated to MDEV-13192 , because the table is not using ROW_FORMAT=COMPRESSED . It is the clustered index root page (page number 3) of tablespace 3. When applying the redo log record, the page is almost empty (no data after the "infimum" and "supremum"). We have 2565 redo log records to that page according to --debug=d,ib_log . The problematic record appears to be right at the start of the recovery: 2018-02-02 20:25:26 140737353852736 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer... innobase_init: ib_log: scan 685398305: log rec 24 len 9 page 0:301 innobase_init: ib_log: scan 685398314: log rec 20 len 14 page 0:301 innobase_init: ib_log: scan 685398328: log rec 38 len 64 page 4:1353 innobase_init: ib_log: scan 685398392: multi-log rec 8 len 11 page 4:1301 innobase_init: ib_log: scan 685398392: multi-log rec 38 len 14 page 4:1301 innobase_init: ib_log: scan 685398392: multi-log rec 31 len 1 page 4:1301 innobase_init: ib_log: scan 685398418: log rec 20 len 23 page 0:301 innobase_init: ib_log: scan 685398441: log rec 38 len 48 page 3:3 This could be a transaction that is writing undo log to page 0:301 and then modifying two tables, in tablespaces 3 and 4. The table in tablespace 3 should be almost empty, because the data appears to have been in the root page. The table name is mysql.gtid_slave_pos. Here is the page dump: root page of mysql.gtid_slave_pos 000000 81 60 7b 75 00 00 00 03 ff ff ff ff ff ff ff ff 000010 00 00 00 00 00 18 a1 2f 45 bf 00 00 00 00 00 00 000020 00 00 00 00 00 03 00 02 00 78 80 02 00 00 00 00 000030 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 00 000040 00 00 00 00 00 00 00 00 00 13 00 00 00 03 00 00 000050 00 02 00 f2 00 00 00 03 00 00 00 02 00 32 01 00 000060 02 00 0d 69 6e 66 69 6d 75 6d 00 01 00 0b 00 00 000070 73 75 70 72 65 6d 75 6d 00 00 00 00 00 00 00 00 000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 003ff0 00 00 00 00 00 70 00 63 d0 27 09 cc 00 18 a1 2f 004000 Now, the problem is that the MLOG_COMP_PAGE_INSERT record (type 38) refers to the insert position 0x179 from the start of the page, but as you can see, the page is empty; it does not contain anything after the page supremum record at 0x70. The next-record pointer of the page infimum is pointing to the page supremum (the 16-bit big endian field at 0x61 contains the relative offset 0x0d). I see only one set of ib_logfile* in the provided mysql.zip , so I do not think that could have used the wrong redo logs for the data files. Above, we see that the 64-bit field FIL_PAGE_LSN at offset 0x10 is 0x18a12f, or 1,614,127. It is much less than the redo log LSN of 685,398,305. These numbers are roughly bytes written to the redo log. It is possible that the LSN 1,614,127 was written when the database was originally initialized. All the nonempty (first 4) pages of the file contain the same FIL_PAGE_LSN . It seems to me that something must have gone wrong when creating the backup. It looks like a much too old version of the data files was copied or restored. It is also possible that in case a file system snapshot was used, it did not work correctly. This does not look like a bug to me, but rather incorrect usage. My wild guess is that someone tried to replace the gtid_slave_pos.ibd with an empty file from a newly initialized instance, to work around some problem. In that case, it would be a better idea to delete the file altogether, and let InnoDB skip any redo log for it.

            People

              marko Marko Mäkelä
              hholzgra Hartmut Holzgraefe
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.