Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.2.2, 10.2(EOL), 10.3(EOL), 10.4(EOL)
Description
In MDEV-30009 we reproduced and analyzed a data corruption scenario that has affected some users and customers.
The scenario is as follows (adapting from MDEV-32044):
- The InnoDB change buffer was enabled, and some INSERT operations for page 0x60bb were buffered in the past. This was the default until the change buffer was disabled by default in
MDEV-27734. - DROP INDEX was executed on this index while buffered changes existed. The entries in the change buffer became stale at this point. This is how Heikki Tuuri designed it; he was a friend of lazy execution at that time.
- CREATE INDEX or ALTER TABLE…ADD INDEX was executed on the table, using a server version this bug had not been fixed. Page 0x60bb was allocated for this index. Heikki Tuuri’s design was such that any stale change buffer entries would be removed at the time when a page is reallocated for something else. This was missed by those who implemented and reviewed an optimization of InnoDB ALTER TABLE in MySQL 5.7, which was applied to MariaDB 10.2.2 without any review.
- (the rest is different from
MDEV-32044): Some changes were buffered for the newly created index page 0x60bb. - At some later point, the stale change buffer entries were wrongly applied to the page 0x60bb in addition to the newly written entries. This would corrupt the secondary index and could cause a crash on change buffer merge, typically due to an unexpected page overflow.
Notes:
- Before
MDEV-19514was implemented in 10.5, a change buffer merge could happen at any point of time, even when the affected table was not being accessed by higher-level operations. - Before
MDEV-13542and some related changes were implemented in MariaDB Server 10.6, corruption detected during change buffer merge would lead to a crash. This would also happen during CHECK TABLE. (See MDEV-28349.)
The bug in index creation is that ibuf_set_bitmap_for_bulk_load() would fail to invoke ibuf_delete_recs() to remove any stale buffered entries for the page that is being reused for the being-created secondary index. When MDEV-30009 was fixed in MariaDB Server 10.5.19 and later, we forgot to fix this in older major versions of MariaDB Server.
This ticket is about porting the fix of ibuf_set_bitmap_for_bulk_load() to MariaDB Server 10.4.
Attachments
Issue Links
- relates to
-
MDEV-20934 Infinite loop on innodb_fast_shutdown=0 with inconsistent change buffer
-
- Closed
-
-
MDEV-29694 Remove the InnoDB change buffer
-
- Closed
-
-
MDEV-30009 InnoDB shutdown hangs when the change buffer is corrupted
-
- Closed
-
-
MDEV-32044 Mariadb crash after upgrading to 11.0.3: Failing assertion: local_len >= BTR_EXTERN_FIELD_REF_SIZE
-
- Closed
-
-
MDEV-36219 InnoDB: ibuf cursor restoration fails! ibuf record inserted to page
-
- Closed
-
We believe we've been hit by this bug when upgrading to 10.4.31. We saw one crash on an specific table with:
Sep 02 15:32:14 db1128 mysqld[1952629]: InnoDB: Failing assertion: !cursor->index->is_committed()
Sep 02 15:32:14 db1128 mysqld[1952629]: InnoDB: We intentionally generate a memory trap.
Sep 02 15:32:14 db1128 mysqld[1952629]: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
Sep 02 15:32:14 db1128 mysqld[1952629]: InnoDB: If you get repeated assertion failures or crashes, even
Sep 02 15:32:14 db1128 mysqld[1952629]: InnoDB: immediately after the mysqld startup, there may be
Sep 02 15:32:14 db1128 mysqld[1952629]: InnoDB: corruption in the InnoDB tablespace. Please refer to
Sep 02 15:32:14 db1128 mysqld[1952629]: InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
Sep 02 15:32:14 db1128 mysqld[1952629]: InnoDB: about forcing recovery.
Sep 02 15:32:14 db1128 mysqld[1952629]: 230902 15:32:14 [ERROR] mysqld got signal 6 ;
Sep 02 15:32:14 db1128 mysqld[1952629]: This could be because you hit a bug. It is also possible that this binary
Sep 02 15:32:14 db1128 mysqld[1952629]: or one of the libraries it was linked against is corrupt, improperly built,
Sep 02 15:32:14 db1128 mysqld[1952629]: or misconfigured. This error can also be caused by malfunctioning hardware.
Sep 02 15:32:14 db1128 mysqld[1952629]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Sep 02 15:32:14 db1128 mysqld[1952629]: We will try our best to scrape up some info that will hopefully help
Sep 02 15:32:14 db1128 mysqld[1952629]: diagnose the problem, but since we have already crashed,
Sep 02 15:32:14 db1128 mysqld[1952629]: something is definitely wrong and this may fail.
Sep 02 15:32:14 db1128 mysqld[1952629]: Server version: 10.4.31-MariaDB-log source revision: 2aea9387497cecb5668ef605b8f80886f9de812c
Sep 02 15:32:14 db1128 mysqld[1952629]: key_buffer_size=134217728
Sep 02 15:32:14 db1128 mysqld[1952629]: read_buffer_size=131072
Sep 02 15:32:14 db1128 mysqld[1952629]: max_used_connections=247
Sep 02 15:32:14 db1128 mysqld[1952629]: max_threads=2010
Sep 02 15:32:14 db1128 mysqld[1952629]: thread_count=54
Sep 02 15:32:14 db1128 mysqld[1952629]: It is possible that mysqld could use up to
Sep 02 15:32:14 db1128 mysqld[1952629]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4753830 K bytes of memory
Sep 02 15:32:14 db1128 mysqld[1952629]: Hope that's ok; if not, decrease some variables in the equation.
Sep 02 15:32:14 db1128 mysqld[1952629]: Thread pointer: 0x7f93a8001538
Sep 02 15:32:14 db1128 mysqld[1952629]: Attempting backtrace. You can use the following information to find out
Sep 02 15:32:14 db1128 mysqld[1952629]: where mysqld died. If you see no messages after this, something went
Sep 02 15:32:14 db1128 mysqld[1952629]: terribly wrong...
Sep 02 15:32:14 db1128 mysqld[1952629]: stack_bottom = 0x7ff2ac481760 thread_stack 0x30000
Sep 02 15:32:15 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(my_print_stacktrace+0x2e)[0x556be23d470e]
Sep 02 15:32:15 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(handle_fatal_signal+0x55d)[0x556be1e56c7d]
Sep 02 15:32:16 db1128 mysqld[1952629]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x13140)[0x7ff2ba678140]
Sep 02 15:32:17 db1128 mysqld[1952629]: /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x141)[0x7ff2ba1b0ce1]
Sep 02 15:32:17 db1128 mysqld[1952629]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x123)[0x7ff2ba19a537]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(+0x5c68c0)[0x556be1b348c0]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(+0x5b2f0b)[0x556be1b20f0b]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(+0xc84717)[0x556be21f2717]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(+0xc84f65)[0x556be21f2f65]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(+0xc961e5)[0x556be22041e5]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(+0xbd98ef)[0x556be21478ef]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(_ZN7handler12ha_write_rowEPKh+0x326)[0x556be1e64ea6]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(_Z12write_recordP3THDP5TABLEP12st_copy_info+0x19d)[0x556be1c12a8d]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(_Z12mysql_insertP3THDP10TABLE_LISTR4ListI4ItemERS3_IS5_ES6_S6_15enum_duplicatesb+0xb0f)[0x556be1c1d42f]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(_Z21mysql_execute_commandP3THD+0x191c)[0x556be1c49c6c]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x223)[0x556be1c50af3]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(_ZN15Query_log_event14do_apply_eventEP14rpl_group_infoPKcj+0x6dd)[0x556be1f6e03d]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(+0x628a24)[0x556be1b96a24]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(handle_slave_sql+0x1672)[0x556be1ba09a2]
Sep 02 15:32:19 db1128 mysqld[1952629]: /opt/wmf-mariadb104/bin/mysqld(+0xb77c02)[0x556be20e5c02]
Sep 02 15:32:20 db1128 mysqld[1952629]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7)[0x7ff2ba66cea7]
Sep 02 15:32:22 db1128 mysqld[1952629]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7ff2ba273a2f]
Sep 02 15:32:22 db1128 mysqld[1952629]: Trying to get some variables.
Sep 02 15:32:22 db1128 mysqld[1952629]: Some pointers may be invalid and cause the dump to abort.
Sep 02 15:32:22 db1128 mysqld[1952629]: Query (0x7f93a904e695): INSERT /* MediaWiki\User\UserOptionsManager::saveOptionsInternal */ IGNORE INTO `user_properties` (up_user,up_property,up_value) VALUES (395179,'echo-subscriptions-push-ge-newcomer',''),(395179,'skin','time>
Sep 02 15:32:22 db1128 mysqld[1952629]: Connection ID (thread ID): 10
Sep 02 15:32:22 db1128 mysqld[1952629]: Status: NOT_KILLED
Sep 02 15:32:22 db1128 mysqld[1952629]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_k>
Sep 02 15:32:22 db1128 mysqld[1952629]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Sep 02 15:32:22 db1128 mysqld[1952629]: information that should help you find out what is causing the crash.
Sep 02 15:32:22 db1128 mysqld[1952629]: Writing a core file...
Sep 02 15:32:22 db1128 mysqld[1952629]: Working directory at /srv/sqldata
Sep 02 15:32:22 db1128 mysqld[1952629]: Resource Limits:
Sep 02 15:32:22 db1128 mysqld[1952629]: Limit Soft Limit Hard Limit Units
Sep 02 15:32:22 db1128 mysqld[1952629]: Max cpu time unlimited unlimited seconds
Sep 02 15:32:22 db1128 mysqld[1952629]: Max file size unlimited unlimited bytes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max data size unlimited unlimited bytes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max stack size 8388608 unlimited bytes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max core file size 0 0 bytes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max resident set unlimited unlimited bytes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max processes 2058221 2058221 processes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max open files 200001 200001 files
Sep 02 15:32:22 db1128 mysqld[1952629]: Max locked memory 65536 65536 bytes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max address space unlimited unlimited bytes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max file locks unlimited unlimited locks
Sep 02 15:32:22 db1128 mysqld[1952629]: Max pending signals 2058221 2058221 signals
Sep 02 15:32:22 db1128 mysqld[1952629]: Max msgqueue size 819200 819200 bytes
Sep 02 15:32:22 db1128 mysqld[1952629]: Max nice priority 0 0
Sep 02 15:32:22 db1128 mysqld[1952629]: Max realtime priority 0 0
Sep 02 15:32:22 db1128 mysqld[1952629]: Max realtime timeout unlimited unlimited us
Sep 02 15:32:22 db1128 mysqld[1952629]: Core pattern: /var/tmp/core/core.%h.%e.%p.%t
Sep 02 15:32:22 db1128 mysqld[1952629]: Kernel version: Linux version 5.10.0-25-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.191-1 (2023-08-16)
Sep 02 15:32:24 db1128 systemd[1]: mariadb.service: Main process exited, code=killed, status=6/ABRT
We thought it could have been a one time thing and we recloned the host. Today we saw the same crash again, with the same table
Sep 13 21:21:46 db1128 mysqld[1632625]: 2023-09-13 21:21:46 0x7ea63fea1700 InnoDB: Assertion failure in file /root/mariadb-10.4.31/storage/innobase/row/row0ins.cc line 219
Sep 13 21:21:46 db1128 mysqld[1632625]: InnoDB: Failing assertion: !cursor->index->is_committed()
Sep 13 21:21:46 db1128 mysqld[1632625]: InnoDB: We intentionally generate a memory trap.
Sep 13 21:21:46 db1128 mysqld[1632625]: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
Sep 13 21:21:46 db1128 mysqld[1632625]: InnoDB: If you get repeated assertion failures or crashes, even
Sep 13 21:21:46 db1128 mysqld[1632625]: InnoDB: immediately after the mysqld startup, there may be
Sep 13 21:21:46 db1128 mysqld[1632625]: InnoDB: corruption in the InnoDB tablespace. Please refer to
Sep 13 21:21:46 db1128 mysqld[1632625]: InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
Sep 13 21:21:46 db1128 mysqld[1632625]: InnoDB: about forcing recovery.
Sep 13 21:21:46 db1128 mysqld[1632625]: 230913 21:21:46 [ERROR] mysqld got signal 6 ;
Sep 13 21:21:46 db1128 mysqld[1632625]: This could be because you hit a bug. It is also possible that this binary
Sep 13 21:21:46 db1128 mysqld[1632625]: or one of the libraries it was linked against is corrupt, improperly built,
Sep 13 21:21:46 db1128 mysqld[1632625]: or misconfigured. This error can also be caused by malfunctioning hardware.
Sep 13 21:21:46 db1128 mysqld[1632625]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Sep 13 21:21:46 db1128 mysqld[1632625]: We will try our best to scrape up some info that will hopefully help
Sep 13 21:21:46 db1128 mysqld[1632625]: diagnose the problem, but since we have already crashed,
Sep 13 21:21:46 db1128 mysqld[1632625]: something is definitely wrong and this may fail.
Sep 13 21:21:46 db1128 mysqld[1632625]: Server version: 10.4.31-MariaDB-log source revision: 2aea9387497cecb5668ef605b8f80886f9de812c
Sep 13 21:21:46 db1128 mysqld[1632625]: key_buffer_size=134217728
Sep 13 21:21:46 db1128 mysqld[1632625]: read_buffer_size=131072
Sep 13 21:21:46 db1128 mysqld[1632625]: max_used_connections=226
Sep 13 21:21:46 db1128 mysqld[1632625]: max_threads=2010
Sep 13 21:21:46 db1128 mysqld[1632625]: thread_count=43
Sep 13 21:21:46 db1128 mysqld[1632625]: It is possible that mysqld could use up to
Sep 13 21:21:46 db1128 mysqld[1632625]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4753830 K bytes of memory
Sep 13 21:21:46 db1128 mysqld[1632625]: Hope that's ok; if not, decrease some variables in the equation.
Sep 13 21:21:46 db1128 mysqld[1632625]: Thread pointer: 0x7ea618000d28
Sep 13 21:21:46 db1128 mysqld[1632625]: Attempting backtrace. You can use the following information to find out
Sep 13 21:21:46 db1128 mysqld[1632625]: where mysqld died. If you see no messages after this, something went
Sep 13 21:21:46 db1128 mysqld[1632625]: terribly wrong...
Sep 13 21:21:46 db1128 mysqld[1632625]: stack_bottom = 0x7ea63fea0760 thread_stack 0x30000
Sep 13 21:21:55 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(my_print_stacktrace+0x2e)[0x5629038d670e]
Sep 13 21:21:55 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(handle_fatal_signal+0x55d)[0x562903358c7d]
Sep 13 21:22:03 db1128 mysqld[1632625]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x13140)[0x7f06271e4140]
Sep 13 21:22:11 db1128 mysqld[1632625]: /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x141)[0x7f0626d1cce1]
Sep 13 21:22:11 db1128 mysqld[1632625]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x123)[0x7f0626d06537]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(+0x5c68c0)[0x5629030368c0]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(+0x5b2f0b)[0x562903022f0b]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(+0xc84717)[0x5629036f4717]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(+0xc84f65)[0x5629036f4f65]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(+0xc961e5)[0x5629037061e5]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(+0xbd98ef)[0x5629036498ef]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(_ZN7handler12ha_write_rowEPKh+0x326)[0x562903366ea6]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(_Z12write_recordP3THDP5TABLEP12st_copy_info+0x19d)[0x562903114a8d]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(_Z12mysql_insertP3THDP10TABLE_LISTR4ListI4ItemERS3_IS5_ES6_S6_15enum_duplicatesb+0xb0f)[0x56290311f42f]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(_Z21mysql_execute_commandP3THD+0x191c)[0x56290314bc6c]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x223)[0x562903152af3]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(_ZN15Query_log_event14do_apply_eventEP14rpl_group_infoPKcj+0x6dd)[0x56290347003d]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(+0x628a24)[0x562903098a24]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(handle_slave_sql+0x1672)[0x5629030a29a2]
Sep 13 21:22:19 db1128 mysqld[1632625]: /opt/wmf-mariadb104/bin/mysqld(+0xb77c02)[0x5629035e7c02]
Sep 13 21:22:28 db1128 mysqld[1632625]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7)[0x7f06271d8ea7]
Sep 13 21:22:37 db1128 mysqld[1632625]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f0626ddfa2f]
Sep 13 21:22:37 db1128 mysqld[1632625]: Trying to get some variables.
Sep 13 21:22:37 db1128 mysqld[1632625]: Some pointers may be invalid and cause the dump to abort.
Sep 13 21:22:37 db1128 mysqld[1632625]: Query (0x7ea618f3de25): INSERT /* MediaWiki\User\UserOptionsManager::saveOptionsInternal */ IGNORE INTO `user_properties` (up_user,up_property,up_value) VALUES (20542576,'advancedsearch-disable-local-exception',''),(20542576,'betafeatures-auto-enroll-local-exception',''),(20542576,'ccmeonemails-local-exception',''),(20542576,'centralnotice-display-campaign-type-advocacy',''),(20542576,'centralnotice-display-campaign-type-advocacy-local-exception',''),(20542576,'centralnotice-display-campaign-type-article-writing-local-exception',''),(20542576,'centralnotice-display-campaign-type-event-local-exception',''),(20542576,'centralnotice-display-campaign-type-fundraising',''),(20542576,'centralnotice-display-campaign-type-fundraising-local-exception',''),(20542576,'centralnotice-display-campaign-type-governance-local-exception',''),(20542576,'centralnotice-display-campaign-type-photography-local-exception',''),(20542576,'cirrussearch-pref-completion-profile-local-exception',''),(20542576,'compact-language-links-local-exception',''),(20542576,'date-local-exception',''),(20542576,'diffonly-local-exception',''),(20542576,'disablemail-local-exception',''),(20542576,'discussiontools-autotopicsub-local-exception',''),(20542576,'discussiontools-betaenable-local-exception',''),(20542576,'discussiontools-newtopictool-createpage-local-exception',''),(20542576,'discussiontools-newtopictool-local-exception',''),(20542576,'discussiontools-replytool-local-exception',''),(20542576,'discussiontools-sourcemodetoolbar-local-exception',''),(20542576,'discussiontools-topicsubscription-local-exception',''),(20542576,'discussiontools-visualenhancements-local-exception',''),(20542576,'echo-cross-wiki-notifications',''),(20542576,'echo-cross-wiki-notifications-local-exception',''),(20542576,'echo-dont-email-read-notifications-local-exception',''),(20542576,'echo-email-format-local-exception',''),(20542576,'echo-email-frequency-local-exception',''),(20542576,'echo-notifications-page-linked-title-muted-list-local-exception',''),(20542576,'echo-subscriptions-email-ge-newcomer',''),(20542576,'echo-subscriptions-email-login-fail',''),(20542576,'echo-subscriptions-email-login-success',''),(20542576,'echo-subscriptions-push-dt-subscription-archiving',''),(20542576,'echo-subscriptions-push-ge-newcomer',''),(20542576,'echo-subscriptions-web-ge-newcomer',''),(20542576,'editfont-local-exception',''),(20542576,'email-allow-new-users-local-exception',''),(20542576,'enotifminoredits-local-exception',''),(20542576,'enotifwatchlistpages-local-exception',''),(20542576,'extendwatchlist-local-exception',''),(20542576,'forceeditsummary-local-exception',''),(20542576,'gadget-ReferenceTooltips',''),(20542576,'gadget-extra-toolbar-buttons',''),(20542576,'gender-local-exception',''),(20542576,'growthexperiments-help-panel-tog-help-panel-local-exception',''),(20542576,'growthexperiments-homepage-enable-local-exception',''),(20542576,'hidecategorization',''),(20542576,'hidecategorization-local-exception',''),(20542576,'hideminor-local-exception',''),(20542576,'imagesize-local-exception',''),(20542576,'ipinfo-beta-feature-enable-local-exception',''),(20542576,'language-local-exception',''),(20542576,'math-local-exception',''),(20542576,'multimediaviewer-enable-local-exception',''),(20542576,'norollbackdiff-local-exception',''),(20542576,'previewonfirst-local-exception',''),(20542576,'previewontop',''),(20542576,'previewontop-local-exception',''),(20542576,'rcdays-local-exception',''),(20542576,'rcenhancedfilters-disable-local-exception',''),(20542576,'rclimit-local-exception',''),(20542576,'rcshowwikidata-local-exception',''),(20542576,'requireemail-local-exception',''),(20542576,'revisionslider-disable-local-exception',''),(20542576,'searchlimit-local-exception',''),(20542576,'showhiddencats-local-exception',''),(20542576,'skin-local-exception',''),(20542576,'skin-responsive-local-exception',''),(20542576,'thumbsize-local-exception',''),(20542576,'timecorrection-local-exception',''),(20542576,'twocolconflict-local-exception',''),(20542576,'underline-local-exception',''),(20542576,'usebetatoolbar-local-exception','1'),(20542576,'usecodemirror-colorblind',''),(20542576,'usecodemirror-colorblind-local-exception',''),(20542576,'useeditwarning-local-exception',''),(20542576,'uselivepreview-local-exception',''),(20542576,'usenewrc-local-exception',''),(20542576,'vector-limited-width-local-exception',''),(20542576,'visualeditor-betatempdisable-local-exception',''),(20542576,'visualeditor-newwikitext-local-exception',''),(20542576,'watchcreations-local-exception',''),(20542576,'watchdefault-local-exception',''),(20542576,'watchlistdays-local-exception',''),(20542576,'watchlisthideanons-local-exception',''),(20542576,'watchlisthideliu-local-exception',''),(20542576,'watchlisthideminor-local-exception',''),(20542576,'watchlisthideown-local-exception',''),(20542576,'watchlistunwatchlinks-local-exception',''),(20542576,'watchmoves-local-exception',''),(20542576,'wikilove-enabled-local-exception',''),(20542576,'wlenhancedfilters-disable-local-exception',''),(20542576,'wllimit-local-exception',''),(20542576,'wlshowwikibase-local-exception',''),(20542576,'usebetatoolbar',''),(20542576,'echo-notifications-page-linked-title-muted-list-local-exception-local-exception','')
Sep 13 21:22:37 db1128 mysqld[1632625]: Connection ID (thread ID): 13616
Sep 13 21:22:37 db1128 mysqld[1632625]: Status: NOT_KILLED
Sep 13 21:22:37 db1128 mysqld[1632625]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=on,mrr_cost_based=on,mrr_sort_keys=on,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=off,condition_pushdown_from_having=on
Sep 13 21:22:37 db1128 mysqld[1632625]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Sep 13 21:22:37 db1128 mysqld[1632625]: information that should help you find out what is causing the crash.
Sep 13 21:22:37 db1128 mysqld[1632625]: Writing a core file...
Sep 13 21:22:37 db1128 mysqld[1632625]: Working directory at /srv/sqldata
Sep 13 21:22:37 db1128 mysqld[1632625]: Resource Limits:
Sep 13 21:22:37 db1128 mysqld[1632625]: Limit Soft Limit Hard Limit Units
Sep 13 21:22:37 db1128 mysqld[1632625]: Max cpu time unlimited unlimited seconds
Sep 13 21:22:37 db1128 mysqld[1632625]: Max file size unlimited unlimited bytes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max data size unlimited unlimited bytes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max stack size 8388608 unlimited bytes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max core file size 0 0 bytes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max resident set unlimited unlimited bytes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max processes 2058221 2058221 processes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max open files 200001 200001 files
Sep 13 21:22:37 db1128 mysqld[1632625]: Max locked memory 65536 65536 bytes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max address space unlimited unlimited bytes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max file locks unlimited unlimited locks
Sep 13 21:22:37 db1128 mysqld[1632625]: Max pending signals 2058221 2058221 signals
Sep 13 21:22:37 db1128 mysqld[1632625]: Max msgqueue size 819200 819200 bytes
Sep 13 21:22:37 db1128 mysqld[1632625]: Max nice priority 0 0
Sep 13 21:22:37 db1128 mysqld[1632625]: Max realtime priority 0 0
Sep 13 21:22:37 db1128 mysqld[1632625]: Max realtime timeout unlimited unlimited us
Sep 13 21:22:37 db1128 mysqld[1632625]: Core pattern: /var/tmp/core/core.%h.%e.%p.%t
Sep 13 21:22:37 db1128 mysqld[1632625]: Kernel version: Linux version 5.10.0-25-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.191-1 (2023-08-16)
Sep 13 21:22:54 db1128 systemd[1]: mariadb.service: Main process exited, code=killed, status=6/ABRT
After spending quite some time with marko today with some gdb traces, it looks like we've been hit by this bug on this table which had a DROP+CREATE index a few years ago and where we had change buffer enabled.
+------------------------+-------+----------+------------------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+------------------------+-------+----------+------------------------------------------------------------------------------+
| enwiki.user_properties | check | Warning | InnoDB: Index 'up_property' contains 270261695 entries, should be 270261639. |
| enwiki.user_properties | check | error | Corrupt |
+------------------------+-------+----------+------------------------------------------------------------------------------+
2 rows in set (9 min 57.704 sec)
Once that table was optimized, replication started to flow again, but the host crashed with the same exception, but this time with a different table, which also had a schema change a few years ago, and which has revealed to have corrupted indexes:
| Table | Op | Msg_type | Msg_text |
+------------------+-------+----------+-------------------------------------------------------------------------------------------+
| enwiki.pagelinks | check | Warning | InnoDB: Index 'pl_namespace' contains 1549475640 entries, should be 1549475630. |
| enwiki.pagelinks | check | Warning | InnoDB: Index 'pl_backlinks_namespace' contains 1549475640 entries, should be 1549475630. |
| enwiki.pagelinks | check | error | Corrupt |
+------------------+-------+----------+-------------------------------------------------------------------------------------------+
3 rows in set (2 hours 2 min 27.976 sec)
For now we are optimizing all the tables in this host.
Thanks Marko for your help here!