[MDEV-20456] MariaDB Crash during replication Created: 2019-08-30  Updated: 2020-06-04

Status: Open
Project: MariaDB Server
Component/s: Replication, Storage Engine - RocksDB
Affects Version/s: 10.2.26
Fix Version/s: 10.2

Type: Bug
Priority: Major
Reporter: Alexander Feller
Assignee: Sergei Petrunia
Resolution: Unresolved
Votes: 1
Labels: None
Environment:

GCP, Ubuntu 18.04.3 LTS, MariaDB 10.2.26, RocksDB



 Description   

2019-08-30 13:39:10 140297599895296 [ERROR] RocksDB: Error detected in background, Status Code: 2, Status: Corruption: block checksum mismatch: expected 4228173794, got 4228174270  in ./#rocksdb/219203.sst offset 18973184 size 16381
2019-08-30 13:39:10 140297599895296 [ERROR] RocksDB: BackgroundErrorReason: 1
2019-08-30 13:39:10 140297599895296 [Note] RocksDB: Creating the file ./#rocksdb/ROCKSDB_CORRUPTED to abort mysqld restarts. Remove this file from the data directory after fixing the corruption to recover.
190830 13:39:10 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.2.26-MariaDB-1:10.2.26+maria~bionic-log
key_buffer_size=20971520
read_buffer_size=131072
max_used_connections=2
max_threads=602
thread_count=12
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 9973526 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55d896fe2a0e]
/usr/sbin/mysqld(handle_fatal_signal+0x513)[0x55d896a5e553]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f99b773a890]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f99b6c4ee97]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f99b6c50801]
/usr/lib/mysql/plugin/ha_rocksdb.so(+0x1ab2b1)[0x7f999583c2b1]
/usr/lib/mysql/plugin/ha_rocksdb.so(_ZN7rocksdb12EventHelpers23NotifyOnBackgroundErrorERKSt6vectorISt10shared_ptrINS_13EventListenerEESaIS4_EENS_21BackgroundErrorReasonEPNS_6StatusEPNS_17InstrumentedMutexEPb+0x138)[0x7f99958cc228]
/usr/lib/mysql/plugin/ha_rocksdb.so(_ZN7rocksdb12ErrorHandler10SetBGErrorERKNS_6StatusENS_21BackgroundErrorReasonE+0x157)[0x7f99958cae97]
/usr/lib/mysql/plugin/ha_rocksdb.so(_ZN7rocksdb6DBImpl20BackgroundCompactionEPbPNS_10JobContextEPNS_9LogBufferEPNS0_19PrepickedCompactionENS_3Env8PriorityE+0x1849)[0x7f9995881569]
/usr/lib/mysql/plugin/ha_rocksdb.so(_ZN7rocksdb6DBImpl24BackgroundCallCompactionEPNS0_19PrepickedCompactionENS_3Env8PriorityE+0x12c)[0x7f99958888fc]
/usr/lib/mysql/plugin/ha_rocksdb.so(_ZN7rocksdb6DBImpl16BGWorkCompactionEPv+0x97)[0x7f9995888fe7]
/usr/lib/mysql/plugin/ha_rocksdb.so(_ZN7rocksdb14ThreadPoolImpl4Impl8BGThreadEm+0x270)[0x7f9995b24c80]
/usr/lib/mysql/plugin/ha_rocksdb.so(_ZN7rocksdb14ThreadPoolImpl4Impl15BGThreadWrapperEPv+0x56)[0x7f9995b24e56]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbd66f)[0x7f99b745c66f]
nptl/pthread_create.c:463(start_thread)[0x7f99b772f6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f99b6d3188f]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Fatal signal 11 while backtracing
2019-08-30 13:39:20 139830559913280 [ERROR] RocksDB: There was a corruption detected in RockDB files. Check error log emitted earlier for more details.
2019-08-30 13:39:20 139830559913280 [ERROR] RocksDB: The server will exit normally and stop restart attempts. Remove ./#rocksdb/ROCKSDB_CORRUPTED file from data directory and start mysqld manually.
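
For anyone comparing several crashes of this kind, the checksum-mismatch line above carries the fields worth recording: the SST file, the block offset and size, and the expected and actual checksums. The following is a minimal Python sketch that pulls those fields out of such a line. The regular expression is derived only from the message format shown in this report, and the names `CORRUPTION_RE`, `parse_corruption` and `SAMPLE` are hypothetical helpers, not part of MariaDB or RocksDB.

```python
import re

# Pattern based solely on the message format seen in this report (an assumption:
# other RocksDB versions may word the message differently).
CORRUPTION_RE = re.compile(
    r"block checksum mismatch: expected (?P<expected>\d+), got (?P<got>\d+)\s+"
    r"in (?P<sst>\S+) offset (?P<offset>\d+) size (?P<size>\d+)"
)


def parse_corruption(line):
    """Return the checksum-mismatch fields as a dict, or None if the line does not match."""
    match = CORRUPTION_RE.search(line)
    if match is None:
        return None
    return {
        "sst": match.group("sst"),
        "expected": int(match.group("expected")),
        "got": int(match.group("got")),
        "offset": int(match.group("offset")),
        "size": int(match.group("size")),
    }


# The error line from this report, copied verbatim (including the double space).
SAMPLE = (
    "2019-08-30 13:39:10 140297599895296 [ERROR] RocksDB: Error detected in background, "
    "Status Code: 2, Status: Corruption: block checksum mismatch: expected 4228173794, "
    "got 4228174270  in ./#rocksdb/219203.sst offset 18973184 size 16381"
)

if __name__ == "__main__":
    print(parse_corruption(SAMPLE))
```

Running the parser over the error logs of several affected replicas would show whether the same SST file or offset recurs, or whether each crash hits a different block.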



 Comments   
Comment by Jonas Krauss [ 2019-09-19 ]

I have had a similar problem over the last few days. I have tried versions 10.4.7, 10.3.17 and 10.2.27; they all share the same MyRocks version and all seem to be affected. I can provide more logs and other details on request, so please let me know what you need.

Thanks

Comment by Jonas Krauss [ 2019-09-23 ]

I have had a slave running MariaDB 10.2.25 (the last version before the MyRocks upstream merge) replicating from our master for the last 72 hours without any problem. This is exactly the same setup as when the corruption occurred; only the MariaDB versions differ. I think the chance is high that there is some kind of incompatibility between RocksDB 5.14.0 and 6.2.0, at least when run under the MariaDB hood in replication. It would be great if this could be investigated further, as otherwise we seem to be stuck on the last MariaDB release before the upstream merge. I am happy to assist if possible.

Comment by Jonas Krauss [ 2020-06-04 ]

Digging up this old issue as we are still struggling to migrate past 10.2.25.

Further observation points to a specific problem with longtext columns: the corruption occurs when replicating a row that contains columns of this type. This is the table definition, in case it helps:

CREATE TABLE `articles` (
  `article_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `generator_type` varchar(50) NOT NULL DEFAULT '',
  `generator_name` varchar(100) DEFAULT NULL,
  `article_outline` varchar(255) NOT NULL,
  `article_text` longtext NOT NULL,
  `data` longtext DEFAULT NULL,
  `language` varchar(2) NOT NULL DEFAULT 'de',
  `created_at` datetime NOT NULL DEFAULT current_timestamp(),
  PRIMARY KEY (`article_id`),
  KEY `articles_generator_type_IDX` (`generator_type`,`created_at`) USING BTREE,
  KEY `created_at` (`created_at`),
  KEY `language` (`language`),
  KEY `articles_language_IDX` (`language`,`generator_type`,`created_at`) USING BTREE
) ENGINE=ROCKSDB AUTO_INCREMENT=77860925 DEFAULT CHARSET=utf8

Generated at Thu Feb 08 08:59:36 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.