[MDEV-9040] 10.1.8 fails after upgrade from 10.0.21 Created: 2015-10-30  Updated: 2019-03-22  Resolved: 2015-11-05

Status: Closed
Project: MariaDB Server
Component/s: Platform RedHat
Affects Version/s: 10.1.8
Fix Version/s: 10.1.9

Type: Bug Priority: Critical
Reporter: Kym Farnik Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Centos 6.7 64bit


Attachments: File mysqld_log_2015-10-28-14-07-01.save    
Issue Links:
Relates
relates to MDEV-19016 Table in InnoDB data dictionary conta... Closed
Sprint: 10.1.9-2

 Description   

After the upgrade via yum the server gets into a restart loop...

2015-10-28 13:25:06 140165019297824 [Note] Recovering after a crash using tc.log
2015-10-28 13:25:06 140165019297824 [Note] Starting crash recovery...
2015-10-28 13:25:06 140165019297824 [Note] Crash recovery finished.
2015-10-28 13:25:06 140165019297824 [Note] Event Scheduler: Loaded 0 events
2015-10-28 13:25:06 7f7a109a5700 InnoDB: Error: Column last_update in table "mysql"."innodb_table_stats" is INT UNSIGNED NOT NULL but should be BINARY(4) NOT NULL (type mismatch).
2015-10-28 13:25:06 7f7a109a5700 InnoDB: Error: Fetch of persistent statistics requested for table "mysql"."gtid_slave_pos" but the required system tables mysql.innodb_table_stats and mysql.innodb_index_stats are not present or have unexpected structure. Using transient stats instead.
2015-10-28 13:25:06 140165019297824 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.1.8-MariaDB-log' socket: '/var/lib/mysql/mysql.sock' port: 0 MariaDB Server
151028 13:25:06 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see http://kb.askmonty.org/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.8-MariaDB-log
key_buffer_size=16777216
read_buffer_size=4194304
max_used_connections=1
max_threads=130
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 817738 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x48000
mysys/stacktrace.c:247(my_print_stacktrace)[0x7f7ab6d636cb]
sql/signal_handler.cc:160(handle_fatal_signal)[0x7f7ab68c50d5]
/lib64/libpthread.so.0(+0xf790)[0x7f7ab5ede790]
include/dict0dict.ic:1244(dict_index_get_nth_col)[0x7f7ab6b1557d]
row/row0purge.cc:850(row_purge_record_func)[0x7f7ab6ae45d5]
que/que0que.cc:1089(que_thr_step)[0x7f7ab6aaa84f]
trx/trx0purge.cc:1254(trx_purge(unsigned long, unsigned long, bool))[0x7f7ab6b121e1]
srv/srv0srv.cc:3432(srv_do_purge)[0x7f7ab6affdc5]
/lib64/libpthread.so.0(+0x7a51)[0x7f7ab5ed6a51]
/lib64/libc.so.6(clone+0x6d)[0x7f7ab43bc93d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.



 Comments   
Comment by Kym Farnik [ 2015-10-30 ]

8 GB memory 64 bit XEN based VM

/etc/my.cnf

[mysqld]
aria_pagecache_buffer_size=512M
character_set_server=utf8
collation_server=utf8_unicode_ci
connect_timeout=10
default-storage-engine=MyISAM
explicit_defaults_for_timestamp=1
ft_min_word_len=3
innodb_buffer_pool_instances=1
innodb_buffer_pool_size=1536M
innodb_data_file_path=ibdata1:10M:autoextend
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=2
innodb_flush_method=O_DIRECT
innodb_log_buffer_size=256M
innodb_log_buffer_size=8M
innodb_log_file_size=128M
innodb_log_files_in_group=2
innodb_open_files=512
innodb_thread_concurrency=0
innodb_read_io_threads=2
innodb_write_io_threads=2
innodb_adaptive_hash_index=0
innodb_use_native_aio=0
interactive_timeout=3600
join_buffer_size=1M
key_buffer_size=16M
local-infile=0
max_allowed_packet=268435456
max_connect_errors=10000
max_connections=128
max_heap_table_size=256M
max_user_connections=96
myisam_sort_buffer_size=2M
open_files_limit=10000
query_cache_limit=1M
query_cache_size=128M
query_cache_type=1
read_buffer_size=4M
read_rnd_buffer_size=2M
skip-networking
slow_query_log_file=/var/log/mysql/slow_query_log_file.log
slow_query_log=1
long_query_time=8
sort_buffer_size=2M
table_open_cache=4096
table_definition_cache=2048
thread_cache_size=128
tmp_table_size=256M
tmpdir="/dev/shm"
wait_timeout=600
[mysqld_safe]
nice=-5
innodb_file_per_table=1
open-files-limit=8192
log-error="/var/log/mysqld.log"
default-storage-engine=MyISAM
[myisamchk]
sort_buffer_size=16M
read_buffer_size=16M
key_buffer_size=64M
write_buffer_size=16M
[mysqldump]
max_allowed_packet=64M
quick

Comment by Kym Farnik [ 2015-10-30 ]

I have reverted to 10.0.21 (and subsequently updated to 10.0.22)

Comment by Kym Farnik [ 2015-10-30 ]

I've added the full log.
It shows 10.0.21 shutdown then all of the 10.1.8 stuff before I kill 10.1.8

Comment by Jan Lindström (Inactive) [ 2015-10-30 ]

Thanks for the info, can you provide the db for investigation ?

Comment by Kym Farnik [ 2015-10-30 ]

G'Day Jan
In what format? Remote Access? How can I do that securely?

Comment by Kym Farnik [ 2015-10-30 ]

I could give access via phpMyAdmin with a temporary password

Comment by Jan Lindström (Inactive) [ 2015-10-30 ]

You can upload compressed and tar'ed db to ftp://ftp.askmonty.org/ there is private directory.

Comment by Kym Farnik [ 2015-10-30 ]

Ok, just a mysqldump --all-databases ... or a tar/gzip of the data folders?

Comment by Jan Lindström (Inactive) [ 2015-10-30 ]

Just tar/gzip all of the data folders, mysqldump could remove the problem.

Comment by Kym Farnik [ 2015-10-30 ]

MDEV-9040.tar.gz 843MB file being FTP'ed now

Contains /var/lib/mysql ...

ftp> put MDEV-9040.tar.gz
local: MDEV-9040.tar.gz remote: MDEV-9040.tar.gz
227 Entering Passive Mode (194,136,193,154,238,109).
150 Ok to send data.
226 Transfer complete.
883618232 bytes sent in 3.98e+03 secs (222.03 Kbytes/sec)

Comment by Jan Lindström (Inactive) [ 2015-11-02 ]

Hi, I would need also your my.cnf file.

Comment by Kym Farnik [ 2015-11-02 ]

[mysqld]
aria_pagecache_buffer_size=512M
character_set_server=utf8
collation_server=utf8_unicode_ci
connect_timeout=10
default-storage-engine=MyISAM
explicit_defaults_for_timestamp=1
ft_min_word_len=3
innodb_buffer_pool_instances=1
innodb_buffer_pool_size=1536M
innodb_data_file_path=ibdata1:10M:autoextend
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=2
innodb_flush_method=O_DIRECT
innodb_log_buffer_size=256M
innodb_log_buffer_size=8M
innodb_log_file_size=128M
innodb_log_files_in_group=2
innodb_open_files=512
innodb_thread_concurrency=0
innodb_read_io_threads=2
innodb_write_io_threads=2
innodb_adaptive_hash_index=0
innodb_use_native_aio=0
interactive_timeout=3600
join_buffer_size=1M
key_buffer_size=16M
local-infile=0
max_allowed_packet=268435456
max_connect_errors=10000
max_connections=128
max_heap_table_size=256M
max_user_connections=96
myisam_sort_buffer_size=2M
open_files_limit=10000
query_cache_limit=1M
query_cache_size=128M
query_cache_type=1
read_buffer_size=4M
read_rnd_buffer_size=2M
skip-networking
slow_query_log_file=/var/log/mysql/slow_query_log_file.log
slow_query_log=1
long_query_time=8
sort_buffer_size=2M
table_open_cache=4096
table_definition_cache=2048
thread_cache_size=128
tmp_table_size=256M
tmpdir="/dev/shm"
wait_timeout=600
[mysqld_safe]
nice=-5
innodb_file_per_table=1
open-files-limit=8192
log-error="/var/log/mysqld.log"
default-storage-engine=MyISAM
[myisamchk]
sort_buffer_size=16M
read_buffer_size=16M
key_buffer_size=64M
write_buffer_size=16M
[mysqldump]
max_allowed_packet=64M
quick

Comment by Elena Stepanova [ 2015-11-03 ]

jplindst:
9d399c9f35ca5a85152adddc1c88a304f87f660c is where it started crashing. Stack trace from that commit:

Thread 1 (Thread 0x7f4249ffb700 (LWP 22324)):
#0  __pthread_kill (threadid=<optimized out>, signo=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:63
#1  0x00000000006f5ab2 in handle_fatal_signal (sig=11) at sql/signal_handler.cc:262
#2  <signal handler called>
#3  mach_read_from_1 (b=0x7f4249400000 <Address 0x7f4249400000 out of bounds>) at storage/xtradb/include/mach0data.ic:56
#4  mach_read_compressed (b=0x7f4249400000 <Address 0x7f4249400000 out of bounds>) at storage/xtradb/include/mach0data.ic:266
#5  trx_undo_rec_get_col_val (ptr=0x7f4249400000 <Address 0x7f4249400000 out of bounds>, field=field@entry=0x7f4249ffab68, len=len@entry=0x7f4249ffab70, orig_len=orig_len@entry=0x7f4249ffab78) at storage/xtradb/trx/trx0rec.cc:331
#6  0x00000000008cf589 in trx_undo_rec_get_partial_row (ptr=<optimized out>, index=0x7f424c7743e8, row=0x7f425b7b66e0, ignore_prefix=0, heap=<optimized out>) at storage/xtradb/trx/trx0rec.cc:1100
#7  0x00000000008a9a7d in row_purge_parse_undo_rec (thr=0x7f4267010eb0, updated_extern=0x7f4249ffac0f, undo_rec=0x7f4249021c80 "\004\264\214", node=0x7f425b7b6668) at storage/xtradb/row/row0purge.cc:776
#8  row_purge (thr=0x7f4267010eb0, undo_rec=0x7f4249021c80 "\004\264\214", node=0x7f425b7b6668) at storage/xtradb/row/row0purge.cc:859
#9  row_purge_step (thr=0x7f4267010eb0) at storage/xtradb/row/row0purge.cc:942
#10 0x000000000087d3bb in que_thr_step (thr=0x7f4267010eb0) at storage/xtradb/que/que0que.cc:1115
#11 que_run_threads_low (thr=0x7f4267010eb0) at storage/xtradb/que/que0que.cc:1177
#12 que_run_threads (thr=0x7f4267010eb0) at storage/xtradb/que/que0que.cc:1218
#13 0x00000000008cc827 in trx_purge (n_purge_threads=1, batch_size=300, truncate=false) at storage/xtradb/trx/trx0purge.cc:1251
#14 0x00000000008c0c8b in srv_do_purge (n_total_purged=<synthetic pointer>, n_threads=1) at storage/xtradb/srv/srv0srv.cc:3215
#15 srv_purge_coordinator_thread (arg=<optimized out>) at storage/xtradb/srv/srv0srv.cc:3397
#16 0x00007f4269861b50 in start_thread (arg=<optimized out>) at pthread_create.c:304
#17 0x00007f4267b1795d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#18 0x0000000000000000 in ?? ()

The direct parent is 0eb84da14712a9ca820533dbc1d911b3aead1658, it starts all right.

Comment by Kym Farnik [ 2015-11-04 ]

Good news ... it looks like you can reproduce the issue.
I'll stay on 10.0.x for now as it's a production system and I don't have much time window to try again.

I can probably try again next week, but methinks I need to wait for 10.1.9

Comment by Jan Lindström (Inactive) [ 2015-11-04 ]

Thanks Elena, confirmed your observation. Now I need to figure out what is wrong with that change. Actual change is very small but it seems to have some hidden effect on purge.

Comment by Jan Lindström (Inactive) [ 2015-11-05 ]

commit 25f8738112b05f33cfa45eabfebf6edfc80e6d8a
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Nov 5 09:42:23 2015 +0200

MDEV-9040: 10.1.8 fails after upgrade from 10.0.21

Analysis: Lengths which are not UNIV_SQL_NULL, but bigger than the following
number indicate that a field contains a reference to an externally
stored part of the field in the tablespace. The length field then
contains the sum of the following flag and the locally stored len.

This was incorrectly set to

define UNIV_EXTERN_STORAGE_FIELD (UNIV_SQL_NULL - UNIV_PAGE_SIZE_MAX)

When it should be

define UNIV_EXTERN_STORAGE_FIELD (UNIV_SQL_NULL - UNIV_PAGE_SIZE_DEF)

Additionally, we need to disable support for > 16K page size for
row compressed tables because a compressed page directory entry
reserves 14 bits for the start offset and 2 bits for flags.
This limits the uncompressed page size to 16k. To support
larger pages page directory entry needs to be larger.

Comment by Kym Farnik [ 2015-11-05 ]

Great work, thanks for the prompt action. This is my first bug report to MariaDB.

Generated at Thu Feb 08 07:31:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.