10.1.8 fails after upgrade from 10.0.21

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.1.8
    • Fix Version/s: 10.1.9
    • Component/s: Platform RedHat
    • Labels: None
    • Environment: CentOS 6.7 64bit
    • Sprint: 10.1.9-2

    Description

      After the upgrade via yum the server gets into a restart loop...

      2015-10-28 13:25:06 140165019297824 [Note] Recovering after a crash using tc.log
      2015-10-28 13:25:06 140165019297824 [Note] Starting crash recovery...
      2015-10-28 13:25:06 140165019297824 [Note] Crash recovery finished.
      2015-10-28 13:25:06 140165019297824 [Note] Event Scheduler: Loaded 0 events
      2015-10-28 13:25:06 7f7a109a5700 InnoDB: Error: Column last_update in table "mysql"."innodb_table_stats" is INT UNSIGNED NOT NULL but should be BINARY(4) NOT NULL (type mismatch).
      2015-10-28 13:25:06 7f7a109a5700 InnoDB: Error: Fetch of persistent statistics requested for table "mysql"."gtid_slave_pos" but the required system tables mysql.innodb_table_stats and mysql.innodb_index_stats are not present or have unexpected structure. Using transient stats instead.
      2015-10-28 13:25:06 140165019297824 [Note] /usr/sbin/mysqld: ready for connections.
      Version: '10.1.8-MariaDB-log' socket: '/var/lib/mysql/mysql.sock' port: 0 MariaDB Server
      151028 13:25:06 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.

      To report this bug, see http://kb.askmonty.org/en/reporting-bugs

      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.

      Server version: 10.1.8-MariaDB-log
      key_buffer_size=16777216
      read_buffer_size=4194304
      max_used_connections=1
      max_threads=130
      thread_count=0
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 817738 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.

      Thread pointer: 0x0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x48000
      mysys/stacktrace.c:247(my_print_stacktrace)[0x7f7ab6d636cb]
      sql/signal_handler.cc:160(handle_fatal_signal)[0x7f7ab68c50d5]
      /lib64/libpthread.so.0(+0xf790)[0x7f7ab5ede790]
      include/dict0dict.ic:1244(dict_index_get_nth_col)[0x7f7ab6b1557d]
      row/row0purge.cc:850(row_purge_record_func)[0x7f7ab6ae45d5]
      que/que0que.cc:1089(que_thr_step)[0x7f7ab6aaa84f]
      trx/trx0purge.cc:1254(trx_purge(unsigned long, unsigned long, bool))[0x7f7ab6b121e1]
      srv/srv0srv.cc:3432(srv_do_purge)[0x7f7ab6affdc5]
      /lib64/libpthread.so.0(+0x7a51)[0x7f7ab5ed6a51]
      /lib64/libc.so.6(clone+0x6d)[0x7f7ab43bc93d]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.

          Activity

            elenst Elena Stepanova added a comment -

            jplindst:
            9d399c9f35ca5a85152adddc1c88a304f87f660c is where it started crashing. Stack trace from that commit:

            Thread 1 (Thread 0x7f4249ffb700 (LWP 22324)):
            #0  __pthread_kill (threadid=<optimized out>, signo=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:63
            #1  0x00000000006f5ab2 in handle_fatal_signal (sig=11) at sql/signal_handler.cc:262
            #2  <signal handler called>
            #3  mach_read_from_1 (b=0x7f4249400000 <Address 0x7f4249400000 out of bounds>) at storage/xtradb/include/mach0data.ic:56
            #4  mach_read_compressed (b=0x7f4249400000 <Address 0x7f4249400000 out of bounds>) at storage/xtradb/include/mach0data.ic:266
            #5  trx_undo_rec_get_col_val (ptr=0x7f4249400000 <Address 0x7f4249400000 out of bounds>, field=field@entry=0x7f4249ffab68, len=len@entry=0x7f4249ffab70, orig_len=orig_len@entry=0x7f4249ffab78) at storage/xtradb/trx/trx0rec.cc:331
            #6  0x00000000008cf589 in trx_undo_rec_get_partial_row (ptr=<optimized out>, index=0x7f424c7743e8, row=0x7f425b7b66e0, ignore_prefix=0, heap=<optimized out>) at storage/xtradb/trx/trx0rec.cc:1100
            #7  0x00000000008a9a7d in row_purge_parse_undo_rec (thr=0x7f4267010eb0, updated_extern=0x7f4249ffac0f, undo_rec=0x7f4249021c80 "\004\264\214", node=0x7f425b7b6668) at storage/xtradb/row/row0purge.cc:776
            #8  row_purge (thr=0x7f4267010eb0, undo_rec=0x7f4249021c80 "\004\264\214", node=0x7f425b7b6668) at storage/xtradb/row/row0purge.cc:859
            #9  row_purge_step (thr=0x7f4267010eb0) at storage/xtradb/row/row0purge.cc:942
            #10 0x000000000087d3bb in que_thr_step (thr=0x7f4267010eb0) at storage/xtradb/que/que0que.cc:1115
            #11 que_run_threads_low (thr=0x7f4267010eb0) at storage/xtradb/que/que0que.cc:1177
            #12 que_run_threads (thr=0x7f4267010eb0) at storage/xtradb/que/que0que.cc:1218
            #13 0x00000000008cc827 in trx_purge (n_purge_threads=1, batch_size=300, truncate=false) at storage/xtradb/trx/trx0purge.cc:1251
            #14 0x00000000008c0c8b in srv_do_purge (n_total_purged=<synthetic pointer>, n_threads=1) at storage/xtradb/srv/srv0srv.cc:3215
            #15 srv_purge_coordinator_thread (arg=<optimized out>) at storage/xtradb/srv/srv0srv.cc:3397
            #16 0x00007f4269861b50 in start_thread (arg=<optimized out>) at pthread_create.c:304
            #17 0x00007f4267b1795d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
            #18 0x0000000000000000 in ?? ()

            The direct parent is 0eb84da14712a9ca820533dbc1d911b3aead1658, it starts all right.

            CypherOz Kym Farnik added a comment -

            Good news ... it looks like you can reproduce the issue.
            I'll stay on 10.0.x for now as it's a production system and I don't have much time window to try again.

            I can probably try again next week, but methinks I need to wait for 10.1.9

            jplindst Jan Lindström (Inactive) added a comment -

            Thanks Elena, confirmed your observation. Now I need to figure out what is wrong with that change. Actual change is very small but it seems to have some hidden effect on purge.

            jplindst Jan Lindström (Inactive) added a comment -

            commit 25f8738112b05f33cfa45eabfebf6edfc80e6d8a
            Author: Jan Lindström <jan.lindstrom@mariadb.com>
            Date: Thu Nov 5 09:42:23 2015 +0200

            MDEV-9040: 10.1.8 fails after upgrade from 10.0.21

            Analysis: Lengths which are not UNIV_SQL_NULL, but bigger than the following
            number indicate that a field contains a reference to an externally
            stored part of the field in the tablespace. The length field then
            contains the sum of the following flag and the locally stored len.

            This was incorrectly set to

            #define UNIV_EXTERN_STORAGE_FIELD (UNIV_SQL_NULL - UNIV_PAGE_SIZE_MAX)

            When it should be

            #define UNIV_EXTERN_STORAGE_FIELD (UNIV_SQL_NULL - UNIV_PAGE_SIZE_DEF)
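
A minimal sketch of the length encoding described in the analysis may help show why the choice of constant matters after an upgrade. It is not the InnoDB code: the helper names are hypothetical, and the values used for the NULL marker (0xFFFFFFFF), the default page size (16K) and the maximum page size (64K) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only (not copied from the InnoDB headers). */
#define SQL_NULL        UINT32_C(0xFFFFFFFF)
#define PAGE_SIZE_DEF   UINT32_C(16384)   /* 16K default page size */
#define PAGE_SIZE_MAX   UINT32_C(65536)   /* 64K maximum page size */

/* The two candidate flag values discussed in the commit message. */
#define EXTERN_FLAG_DEF (SQL_NULL - PAGE_SIZE_DEF)
#define EXTERN_FLAG_MAX (SQL_NULL - PAGE_SIZE_MAX)

/* Writer side: the stored length is the flag plus the locally stored length. */
uint32_t encode_extern_len(uint32_t flag, uint32_t local_len)
{
    assert(local_len < PAGE_SIZE_DEF);  /* keeps the sum below SQL_NULL */
    return flag + local_len;
}

/* Reader side: returns 1 and the local length if the field is marked as
 * externally stored, otherwise returns 0 and the length unchanged. */
int decode_len(uint32_t flag, uint32_t stored, uint32_t *local_len)
{
    if (stored != SQL_NULL && stored >= flag) {
        *local_len = stored - flag;
        return 1;
    }
    *local_len = stored;
    return 0;
}
```

Under these assumptions, a length of 768 written with `EXTERN_FLAG_DEF` decodes back to 768 with the same flag, but decodes to 49920 (768 plus the 48K difference between the two page sizes) with `EXTERN_FLAG_MAX`. A reader that trusts such an inflated length would step far past the end of the record, which is at least consistent with the out-of-bounds pointer seen in `trx_undo_rec_get_col_val` in the stack trace.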

            Additionally, we need to disable support for > 16K page size for
            row compressed tables because a compressed page directory entry
            reserves 14 bits for the start offset and 2 bits for flags.
            This limits the uncompressed page size to 16K. To support
            larger pages, the page directory entry needs to be larger.
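
The 16K limit follows directly from the 14-bit offset field, since 2^14 = 16384. The sketch below packs a 16-bit directory entry to make the arithmetic concrete. The function and macro names are hypothetical, and placing the two flag bits above the offset is an assumption; only the 14 + 2 bit split comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define DIR_OFFSET_BITS   14
#define DIR_OFFSET_MASK   ((1u << DIR_OFFSET_BITS) - 1u)  /* 0x3FFF */
#define DIR_MAX_PAGE_SIZE (1u << DIR_OFFSET_BITS)         /* 16384 bytes */

/* Pack a start offset and two flag bits into one 16-bit entry. */
uint16_t dir_entry_pack(uint32_t offset, unsigned flags)
{
    assert(offset <= DIR_OFFSET_MASK);  /* offset must fit in 14 bits */
    assert(flags <= 3u);                /* only 2 flag bits available */
    return (uint16_t)((flags << DIR_OFFSET_BITS) | offset);
}

/* Extract the 14-bit start offset. */
uint32_t dir_entry_offset(uint16_t entry)
{
    return entry & DIR_OFFSET_MASK;
}

/* Extract the 2 flag bits. */
unsigned dir_entry_flags(uint16_t entry)
{
    return (unsigned)(entry >> DIR_OFFSET_BITS);
}

/* Returns 1 if every byte offset inside a page of page_size bytes
 * can be represented in the 14-bit offset field. */
int page_size_fits(uint32_t page_size)
{
    return page_size <= DIR_MAX_PAGE_SIZE;
}
```

With this layout an offset inside a 32K page, such as 32767, cannot be stored: masking it to 14 bits yields 16383, so the entry would silently point at a different record. Widening the entry is the only way to address larger uncompressed pages.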

            CypherOz Kym Farnik added a comment -

            Great work, thanks for the prompt action. This is my first bug report to MariaDB.

            People

              Assignee: jplindst Jan Lindström (Inactive)
              Reporter: CypherOz Kym Farnik