[MDEV-13787] Crash in persistent stats wsrep_on (thd=0x0) Created: 2017-09-12  Updated: 2017-09-28  Resolved: 2017-09-27

Status: Closed
Project: MariaDB Server
Component/s: Galera, Storage Engine - InnoDB
Affects Version/s: 10.1.27
Fix Version/s: 10.1.28

Type: Bug Priority: Blocker
Reporter: Andrii Nikitin (Inactive) Assignee: Sachin Setiya (Inactive)
Resolution: Fixed Votes: 1
Labels: None

Attachments: File patch.diff     File patch.diff     File patch.diff    
Issue Links:
Duplicate
duplicates MDEV-13908 10.1.27 Segfaulting Closed
duplicates MDEV-13910 after update not starting Closed
Problem/Incident
is caused by MDEV-7955 WSREP() appears on radar in OLTP RO Closed

 Description   

Following galera test will cause crash when runs with -mysqld=-innodb-stats-persistent=1

--source include/galera_cluster.inc
--source include/have_innodb.inc
--connection node_1
create table t(a int);
insert into t select 1;
DROP TABLE t;

m2-10.1/build/mysql-test/mtr x --mysqld=--innodb-stats-persistent=1
Logging: /home/a/mariadb-environs/_depot/m-branch/10.1/mysql-test/mysql-test-run.pl  x --mysqld=--innodb-stats-persistent=1
vardir: /home/a/mariadb-environs/m2-10.1/build/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/a/mariadb-environs/m2-10.1/build/mysql-test/var'...
Checking supported features...
MariaDB Version 10.1.27-MariaDB-debug
 - SSL connections supported
 - binaries are debug compiled
Sphinx 'indexer' binary not found, sphinx suite will be skipped
Collecting tests...
Installing system database...
 
==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
create table t(a int);
insert into t select 1;
DROP TABLE t;
galera.x 'innodb_plugin'                 [ fail ]
        Test ended at 2017-09-13 07:04:54
 
CURRENT_TEST: galera.x
 
 
Server [mysqld.2 - pid: 12498, winpid: 12498, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
2017-09-13  7:04:45 140338849577920 [Warning] WSREP: option --wsrep-causal-reads is deprecated
2017-09-13  7:04:45 140338849577920 [Note] /home/a/mariadb-environs/m2-10.1/build/sql/mysqld (mysqld 10.1.27-MariaDB-debug) starting as process 12499 ...
2017-09-13  7:04:45 140338849577920 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4162)
2017-09-13  7:04:45 140338849577920 [Warning] WSREP: wsrep_sst_receive_address is set to '127.0.0.1:16007' which makes it impossible for another host to reach this one. Please set it to the address which this node can be connected at by other cluster members.
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Read nil XID from storage engines, skipping position init
2017-09-13  7:04:45 140338849577920 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2017-09-13  7:04:45 140338849577920 [Note] WSREP: wsrep_load(): Galera 25.3.20(r3703) by Codership Oy <info@codership.com> loaded successfully.
2017-09-13  7:04:45 140338849577920 [Note] WSREP: CRC-32C: using hardware acceleration.
2017-09-13  7:04:45 140338849577920 [Warning] WSREP: Could not open state file for reading: '/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/mysqld.2/data//grastate.dat'
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootsrap: 1
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Passing config to GCS: base_dir = /home/a/mariadb-environs/m2-10.1/build/mysql-test/var/mysqld.2/data/; base_host = 127.0.0.1; base_port = 16005; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT30S; evs.install_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT10S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /home/a/mariadb-environs/m2-10.1/build/mysql-test/var/mysqld.2/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /home/a/mariadb-environs/m2-10.1/build/mysql-test/var/mysqld.2/data//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gc
2017-09-13  7:04:45 140338849577920 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(00000000-0000-0000-0000-000000000000:-1)
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2017-09-13  7:04:45 140338849577920 [Note] WSREP: wsrep_sst_grab()
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Start replication
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2017-09-13  7:04:45 140338849577920 [Note] WSREP: protonet asio version 0
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Using CRC-32C for message checksums.
2017-09-13  7:04:45 140338849577920 [Note] WSREP: backend: asio
2017-09-13  7:04:45 140338849577920 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2017-09-13  7:04:45 140338849577920 [Warning] WSREP: access file(/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/mysqld.2/data//gvwstate.dat) failed(No such file or directory)
2017-09-13  7:04:45 140338849577920 [Note] WSREP: restore pc from disk failed
2017-09-13  7:04:45 140338849577920 [Note] WSREP: GMCast version 0
2017-09-13  7:04:45 140338849577920 [Note] WSREP: (0f11b6d5, 'tcp://0.0.0.0:16005') listening at tcp://0.0.0.0:16005
2017-09-13  7:04:45 140338849577920 [Note] WSREP: (0f11b6d5, 'tcp://0.0.0.0:16005') multicast: , ttl: 1
2017-09-13  7:04:45 140338849577920 [Note] WSREP: EVS version 0
2017-09-13  7:04:45 140338849577920 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '127.0.0.1:16002'
2017-09-13  7:04:45 140338849577920 [Note] WSREP: (0f11b6d5, 'tcp://0.0.0.0:16005') connection established to 0f11b915 tcp://127.0.0.1:16002
2017-09-13  7:04:45 140338849577920 [Note] WSREP: (0f11b6d5, 'tcp://0.0.0.0:16005') turning message relay requesting on, nonlive peers: 
2017-09-13  7:04:45 140338849577920 [Note] WSREP: declaring 0f11b915 at tcp://127.0.0.1:16002 stable
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Node 0f11b915 state prim
2017-09-13  7:04:45 140338849577920 [Note] WSREP: view(view_id(PRIM,0f11b6d5,2) memb {
	0f11b6d5,0
	0f11b915,0
} joined {
} left {
} partitioned {
})
2017-09-13  7:04:45 140338849577920 [Note] WSREP: save pc into disk
2017-09-13  7:04:45 140338849577920 [Note] WSREP: gcomm: connected
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Opened channel 'my_wsrep_cluster'
2017-09-13  7:04:45 140338509375232 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
2017-09-13  7:04:45 140338849577920 [Note] WSREP: Waiting for SST to complete.
2017-09-13  7:04:45 140338509375232 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 0f5e2ea4-9841-11e7-97bd-63fcde780f96
2017-09-13  7:04:45 140338509375232 [Note] WSREP: STATE EXCHANGE: sent state msg: 0f5e2ea4-9841-11e7-97bd-63fcde780f96
2017-09-13  7:04:45 140338509375232 [Note] WSREP: STATE EXCHANGE: got state msg: 0f5e2ea4-9841-11e7-97bd-63fcde780f96 from 0 (UBINTI)
2017-09-13  7:04:45 140338509375232 [Note] WSREP: STATE EXCHANGE: got state msg: 0f5e2ea4-9841-11e7-97bd-63fcde780f96 from 1 (UBINTI)
2017-09-13  7:04:45 140338509375232 [Note] WSREP: Quorum results:
	version    = 4,
	component  = PRIMARY,
	conf_id    = 1,
	members    = 1/2 (joined/total),
	act_id     = 0,
	last_appl. = -1,
	protocols  = 0/7/3 (gcs/repl/appl),
	group UUID = 0f11e849-9841-11e7-9f1f-9730a47e5f43
2017-09-13  7:04:45 140338509375232 [Note] WSREP: Flow-control interval: [23, 23]
2017-09-13  7:04:45 140338509375232 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 0)
2017-09-13  7:04:45 140338849258240 [Note] WSREP: State transfer required: 
	Group state: 0f11e849-9841-11e7-9f1f-9730a47e5f43:0
	Local state: 00000000-0000-0000-0000-000000000000:-1
2017-09-13  7:04:45 140338849258240 [Note] WSREP: New cluster view: global state: 0f11e849-9841-11e7-9f1f-9730a47e5f43:0, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 3
2017-09-13  7:04:45 140338849258240 [Warning] WSREP: Gap in state sequence. Need state transfer.
2017-09-13  7:04:45 140338480019200 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '127.0.0.1:16007' --datadir '/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/mysqld.2/data/'  --defaults-file '/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/my.cnf'  --parent '12499'  '' '
2017-09-13  7:04:45 140338849258240 [Note] WSREP: Prepared SST request: rsync|127.0.0.1:16007/rsync_sst
2017-09-13  7:04:45 140338849258240 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-09-13  7:04:45 140338849258240 [Note] WSREP: REPL Protocols: 7 (3, 2)
2017-09-13  7:04:45 140338849258240 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
2017-09-13  7:04:45 140338567313152 [Note] WSREP: Service thread queue flushed.
2017-09-13  7:04:45 140338849258240 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (0f11e849-9841-11e7-9f1f-9730a47e5f43): 1 (Operation not permitted)
	 at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2017-09-13  7:04:45 140338509375232 [Note] WSREP: Member 0.0 (UBINTI) requested state transfer from '*any*'. Selected 1.0 (UBINTI)(SYNCED) as donor.
2017-09-13  7:04:45 140338509375232 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
2017-09-13  7:04:45 140338849258240 [Note] WSREP: Requesting state transfer: success, donor: 1
2017-09-13  7:04:45 140338849258240 [Note] WSREP: GCache history reset: old(00000000-0000-0000-0000-000000000000:0) -> new(0f11e849-9841-11e7-9f1f-9730a47e5f43:0)
2017-09-13  7:04:48 140338509375232 [Note] WSREP: 1.0 (UBINTI): State transfer to 0.0 (UBINTI) complete.
2017-09-13  7:04:48 140338509375232 [Note] WSREP: Member 1.0 (UBINTI) synced with group.
2017-09-13  7:04:48 140338517767936 [Note] WSREP: (0f11b6d5, 'tcp://0.0.0.0:16005') turning message relay requesting off
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 12582 (20170913 07:04:48.732)
WSREP_SST: [INFO] Joiner cleanup done. (20170913 07:04:49.235)
2017-09-13  7:04:49 140338849577920 [Note] WSREP: SST complete, seqno: 0
2017-09-13  7:04:49 140338849577920 [Note] Plugin 'partition' is disabled.
2017-09-13  7:04:49 140338849577920 [Note] Plugin 'SEQUENCE' is disabled.
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Using mutexes to ref count buffer pool pages
2017-09-13  7:04:49 140338849577920 [Note] InnoDB:  InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
2017-09-13  7:04:49 140338849577920 [Note] InnoDB:  InnoDB: !!!!!!!! UNIV_SYNC_DEBUG switched on !!!!!!!!!
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: The InnoDB memory heap is disabled
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Compressed tables use zlib 1.2.8
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Using Linux native AIO
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Using SSE crc32 instructions
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Initializing buffer pool, size = 8.0M
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Completed initialization of buffer pool
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Highest supported file format is Barracuda.
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1629454
2017-09-13  7:04:49 140338849577920 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
2017-09-13  7:04:50 140338849577920 [Note] InnoDB: 128 rollback segment(s) are active.
2017-09-13  7:04:50 140338849577920 [Note] InnoDB: Waiting for purge to start
2017-09-13  7:04:50 140338849577920 [Note] InnoDB: 5.6.37 started; log sequence number 1629618
2017-09-13  7:04:50 140338245138176 [Note] InnoDB: Dumping buffer pool(s) not yet started
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_SYS_DATAFILES' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_CMP' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'XTRADB_READ_VIEW' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'XTRADB_RSEG' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_FT_DELETED' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_LOCK_WAITS' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'CHANGED_PAGE_BITMAPS' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_TABLESPACES_ENCRYPTION' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_SYS_FIELDS' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'XTRADB_INTERNAL_HASH_TABLES' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_TABLESPACES_SCRUBBING' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_FT_INDEX_TABLE' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_FT_BEING_DELETED' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_MUTEXES' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_FT_CONFIG' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'FEEDBACK' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_FT_INDEX_CACHE' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_FT_DEFAULT_STOPWORD' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_SYS_COLUMNS' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_CMP_PER_INDEX_RESET' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_SYS_TABLESPACES' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_SYS_SEMAPHORE_WAITS' is disabled.
2017-09-13  7:04:50 140338849577920 [Note] Plugin 'INNODB_SYS_INDEXES' is disabled.
2017-09-13  7:04:50 140338849577920 [Warning] /home/a/mariadb-environs/m2-10.1/build/sql/mysqld: unknown option '--loose-pam-debug'
2017-09-13  7:04:50 140338849577920 [Warning] /home/a/mariadb-environs/m2-10.1/build/sql/mysqld: unknown option '--loose-skip-plugin-innodb-changed-pages'
2017-09-13  7:04:50 140338849577920 [Note] Server socket created on IP: '127.0.0.1'.
2017-09-13  7:04:50 140338849577920 [Note] WSREP: Signalling provider to continue.
2017-09-13  7:04:50 140338849577920 [Note] WSREP: SST received: 0f11e849-9841-11e7-9f1f-9730a47e5f43:0
2017-09-13  7:04:50 140338509375232 [Note] WSREP: 0.0 (UBINTI): State transfer from 1.0 (UBINTI) complete.
2017-09-13  7:04:50 140338509375232 [Note] WSREP: Shifting JOINER -> JOINED (TO: 0)
2017-09-13  7:04:50 140338509375232 [Note] WSREP: Member 0.0 (UBINTI) synced with group.
2017-09-13  7:04:50 140338509375232 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 0)
2017-09-13  7:04:50 140338849258240 [Note] WSREP: Synchronized with group, ready for connections
2017-09-13  7:04:50 140338849258240 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2017-09-13  7:04:50 140338849577920 [Note] /home/a/mariadb-environs/m2-10.1/build/sql/mysqld: ready for connections.
Version: '10.1.27-MariaDB-debug'  socket: '/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/tmp/mysqld.2.sock'  port: 16001  Source distribution
170913  7:04:53 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.1.27-MariaDB-debug
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=3
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63015 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7fa319416070
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fa32f420e98 thread_stack 0x48400
/home/a/mariadb-environs/m2-10.1/build/sql/mysqld(my_print_stacktrace+0x38)[0x5620260476ba]
/home/a/mariadb-environs/m2-10.1/build/sql/mysqld(handle_fatal_signal+0x394)[0x5620259e3afd]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11630)[0x7fa32df2a630]
/home/a/mariadb-environs/m2-10.1/build/sql/mysqld(wsrep_on+0x1e)[0x56202595995d]
mysys/stacktrace.c:267(my_print_stacktrace)[0x7fa317c3cf66]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x2db5c3)[0x7fa317c3d5c3]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x2db8f8)[0x7fa317c3d8f8]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x2dbd70)[0x7fa317c3dd70]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x251860)[0x7fa317bb3860]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x251b48)[0x7fa317bb3b48]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x251cd6)[0x7fa317bb3cd6]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x251f1c)[0x7fa317bb3f1c]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x11294f)[0x7fa317a7494f]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x118b51)[0x7fa317a7ab51]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x118d7c)[0x7fa317a7ad7c]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x29a417)[0x7fa317bfc417]
/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/plugins/ha_innodb.so(+0x193b4c)[0x7fa317af5b4c]
sql/handler.cc:4327(handler::ha_delete_table(char const*))[0x5620259ee58a]
sql/handler.cc:2382(ha_delete_table(THD*, handlerton*, char const*, char const*, char const*, bool))[0x5620259e965e]
sql/sql_table.cc:2468(mysql_rm_table_no_locks(THD*, TABLE_LIST*, bool, bool, bool, bool, bool))[0x562025860239]
sql/sql_table.cc:2083(mysql_rm_table(THD*, TABLE_LIST*, char, char))[0x56202585f455]
sql/sql_parse.cc:4245(mysql_execute_command(THD*))[0x5620257adbd2]
sql/sql_parse.cc:7333(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x5620257b76a7]
sql/log_event.cc:4455(Query_log_event::do_apply_event(rpl_group_info*, char const*, unsigned int))[0x562025adafd6]
sql/log_event.cc:4166(Query_log_event::do_apply_event(rpl_group_info*))[0x562025ada14f]
sql/log_event.h:1343(Log_event::apply_event(rpl_group_info*))[0x562025713749]
sql/wsrep_applier.cc:165(wsrep_apply_events(THD*, void const*, unsigned long))[0x56202596717c]
sql/wsrep_applier.cc:265(wsrep_apply_cb)[0x562025967902]
/usr/lib/galera/libgalera_smm.so(_ZNK6galera9TrxHandle5applyEPvPF15wsrep_cb_statusS1_PKvmjPK14wsrep_trx_metaERS6_+0xfe)[0x7fa3269214be]
/usr/lib/galera/libgalera_smm.so(+0x2321af)[0x7fa3269731af]
/usr/lib/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM9apply_trxEPvPNS_9TrxHandleE+0xbe)[0x7fa326975d5e]
/usr/lib/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM11process_trxEPvPNS_9TrxHandleE+0x13e)[0x7fa32697878e]
/usr/lib/galera/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_actionRb+0x1d8)[0x7fa32694ee48]
/usr/lib/galera/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPvRb+0x76)[0x7fa326950b06]
/usr/lib/galera/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x8b)[0x7fa326978e8b]
/usr/lib/galera/libgalera_smm.so(galera_recv+0x2b)[0x7fa32699336b]
sql/wsrep_thd.cc:360(wsrep_replication_process(THD*))[0x562025969733]
sql/wsrep_mysqld.cc:1928(start_wsrep_THD)[0x5620259583ea]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ca)[0x7fa32df206ca]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x5f)[0x7fa32d5c9caf]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fa31945117b): DROP TABLE t
Connection ID (thread ID): 2
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
----------SERVER LOG END-------------
mysqltest failed but provided no output
 
 
 - saving '/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/log/galera.x-innodb_plugin/' to '/home/a/mariadb-environs/m2-10.1/build/mysql-test/var/log/galera.x-innodb_plugin/'
 - found 'core' (0/5)
 
Trying 'dbx' to get a backtrace
 
Trying 'gdb' to get a backtrace from coredump /home/a/mariadb-environs/m2-10.1/build/mysql-test/var/log/galera.x-innodb_plugin/mysqld.1/data/core
 
Only  1  of 2 completed.
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.000 of 17 seconds executing testcases
 
Failure: Failed 1/1 tests, 0.00% were successful.
 
Failing test(s): galera.x
 
The log files in var/log may give you some hint of what went wrong.
 
If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html
 
mysql-test-run: *** ERROR: there were failing test cases

Below is resolved stack trace:

2017-09-12 18:32:04 140203330374400 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
170912 18:32:25 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.1.27-MariaDB
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=3
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467135 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x48400
/home/a/mariadb-environs/m6-10.1/build/sql/mysqld(my_print_stacktrace+0x29)[0x556ba491e569]
/home/a/mariadb-environs/m6-10.1/build/sql/mysqld(handle_fatal_signal+0x3ad)[0x556ba45000dd]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11630)[0x7f83a06f3630]
/home/a/mariadb-environs/m6-10.1/build/sql/mysqld(wsrep_on+0x12)[0x556ba44a0462]
/home/a/mariadb-environs/m6-10.1/build/sql/mysqld(+0x8598ee)[0x556ba47ee8ee]
mysys/stacktrace.c:268(my_print_stacktrace)[0x556ba47ef003]
sql/wsrep_mysqld.cc:2298(wsrep_on)[0x556ba47adec0]
que/que0que.cc:1277(que_eval_sql(pars_info_t*, char const*, unsigned long, trx_t*))[0x556ba47ae5cf]
dict/dict0stats.cc:332(dict_stats_exec_sql(pars_info_t*, char const*, trx_t*))[0x556ba4898f48]
dict/dict0stats.cc:2573(dict_stats_save(dict_table_t*, unsigned long const*) [clone .part.65])[0x556ba489e699]
dict/dict0stats.cc:3305(dict_stats_update(dict_table_t*, dict_stats_upd_option_t))[0x556ba489f818]
dict/dict0stats_bg.cc:464(dict_stats_process_entry_from_recalc_pool)[0x556ba48a11e5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ca)[0x7f83a06e96ca]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x5f)[0x7f839fd92caf]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
170912 18:32:25 mysqld_safe WSREP: not restarting wsrep node automatically



 Comments   
Comment by Sachin Setiya (Inactive) [ 2017-09-13 ]

patch.diff

Comment by Marko Mäkelä [ 2017-09-13 ]

In the patch, we are skipping write-set replication for some InnoDB lock requests.
I think that this is very dangerous. Please confirm from the Galera developers if running with innodb_stats_persistent=1 is supported or not.

Are there any Galera tests that modify mysql.innodb_index_stats and mysql.innodb_table_stats from SQL while persistent statistics are being recalculated on the background? If not yet and if persistent statistics are supposed to be supported, please write such tests.

To avoid code duplication, I would suggest to introduce a predicate like

inline bool wsrep_on_trx(const trx_t* trx)
{
	return trx->mysql_thd && wsrep_on(trx->mysql_thd);
}

Comment by Jan Lindström (Inactive) [ 2017-09-26 ]

I think there is some misunderstanding here. Write set replication is hooked on mysql_parse. All InnoDB internal parser executions are done only on inside a single node. Therefore, updates to InnoDB system tables i.e. data dictionary and these persistent statistic changes are done only on that node. Naturally, similar operation is done on all other nodes. If those tables are user updated using SQL those operations are included to write set replication. You could add a test case to show that this last part is really done.

Comment by Marko Mäkelä [ 2017-09-26 ]

jplindst, does Galera block user SQL access to the tables mysql.innodb_index_stats and mysql.innodb_table_stats? If not, I fear that it will be seriously broken, because the InnoDB internal access would skip write-set replication while the user-level SQL would go through it.

Note that the InnoDB internal updates to these two tables are also incompatible with normal MySQL replication, for two reasons: the operations are not covered by MDL, and they are not covered by the binlog.

Comment by Sachin Setiya (Inactive) [ 2017-09-26 ]

This issue is caused by commit b4616c40be00c5c8a2a73d537d67

Comment by Andrii Nikitin (Inactive) [ 2017-09-26 ]

With hint from marko I confirm that exact problem was injected with merge efc0ec7631123c9219ac5d0723875a2748f624ee .

Comment by Sachin Setiya (Inactive) [ 2017-09-26 ]

Latest Patch. patch.diff

Comment by Marko Mäkelä [ 2017-09-26 ]

I would say that this issue was caused by two changes. The optimization of wsrep_on() in MDEV-7955 did not break anything immediately. The follow-up patches to the FOREIGN KEY code that were recently merged to 10.1 did. These would be the two commits:
MW-369 FK fixes and MW-369 FK fixes.

In 10.2, I restructured the code in MDEV-13498, but I suppose that it could crash also there if you defined a FOREIGN KEY that uses one of the tables mysql.innodb_table_stats and mysql.innodb_index_stats as the child or parent table.

The revised patch looks OK to me except for the following snippets:

@@ -402,7 +402,7 @@ lock_wait_suspend_thread(
     if (lock_wait_timeout < 100000000
         && wait_time > (double) lock_wait_timeout) {
 #ifdef WITH_WSREP
-                if (!wsrep_on(trx->mysql_thd) ||
+                if (wsrep_on_trx(trx) ||
                     (!wsrep_is_BF_lock_timeout(trx) &&
                      trx->error_state != DB_DEADLOCK)) {
 #endif /* WITH_WSREP */

We surely should not invert the condition here.

Another (minor) issue is an unnecessary check of trx != NULL here:

@@ -2536,8 +2536,7 @@ row_upd_del_mark_clust_rec(
     }
 #ifdef WITH_WSREP
     trx_t* trx = thr_get_trx(thr) ;
-
-    if (err == DB_SUCCESS && !referenced && trx && wsrep_on(trx->mysql_thd) &&
+    if (err == DB_SUCCESS && !referenced && trx && wsrep_on_trx(trx) &&
         !(parent && que_node_get_type(parent) == QUE_NODE_UPDATE &&
           ((upd_node_t*)parent)->cascade_node == node)           &&
         foreign

There always must be a trx object when InnoDB is accessing a table. thr_get_trx(thr) should never return NULL. You can have trx without thr, but not vice versa.

Comment by Sachin Setiya (Inactive) [ 2017-09-27 ]

Hi marko!

Thanks for the review. Patch pushed into 10.1.

Comment by Sachin Setiya (Inactive) [ 2017-09-27 ]

http://lists.askmonty.org/pipermail/commits/2017-September/011512.html

Generated at Thu Feb 08 08:08:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.