[MDEV-24578] MariaDB 10.5 fails to join Galera cluster of MariaDB 10.1, root cause SST failure, symptom is InnoDB: Assertion failure Created: 2021-01-12  Updated: 2021-12-23  Resolved: 2021-12-23

Status: Closed
Project: MariaDB Server
Component/s: Galera SST, Storage Engine - InnoDB
Affects Version/s: 10.5
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Otto Kekäläinen Assignee: Jan Lindström (Inactive)
Resolution: Not a Bug Votes: 0
Labels: None

Attachments: Text File README.txt     File gdb-63031.log     File gdb-6520.log     Text File journald-mariadb.txt    
Issue Links:
Relates
relates to MDEV-13542 Crashing on a corrupted page is unhel... Closed
relates to MDEV-24284 InnoDB: Record field 64 len 18446744... Closed
relates to MDEV-24162 InnoDB: Assertion failure in file /ho... Closed
relates to MDEV-24412 Mariadb 10.5: InnoDB: Upgrade after a... Closed

 Description   

(This was originally filed as https://jira.mariadb.org/browse/MDEV-24162, but mistakenly marked as duplicate of MDEV-23672, which it is not)

While trying to join an existing Galera cluster of two MariaDB 10.1.48 nodes with a fresh install of MariaDB 10.5.8, the maria-backup part runs fine but once all files have been transferred and the mariadbd tries to start it fails on:

tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Evaluating /usr//bin/mariabackup  --move-back --no-version-check   --move-back --force-non-empty-directories --target-dir=${DATA} &> /var/lib/mysql//mariabackup.move.log (20210112 20:20:51.467)
tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Move successful, removing /var/lib/mysql//.sst (20210112 20:20:51.742)
tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Galera co-ords from recovery: 7f74fdea-cda3-11e9-ae2f-ab188b9a01f2:339777552 (20210112 20:20:51.752)
tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Total time on joiner: 0 seconds (20210112 20:20:51.756)
tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Removing the sst_in_progress file (20210112 20:20:51.759)
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 3 [Note] WSREP: SST received
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 3 [Note] WSREP: Server status change joiner -> initializing
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Uses event mutexes
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Number of pools: 1
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Using Linux native AIO
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Initializing buffer pool, total size = 2147483648, chunk size = 134217728
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Completed initialization of buffer pool
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 100663296 bytes
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: New log file created, LSN=1586283788623
tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: 128 rollback segments are active.
tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: Creating sys_virtual system tables.
tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: Creating shared tablespace for temporary tables
tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: 10.5.9 started; log sequence number 0; transaction id 8247886884634323262
tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0x7fd1737fe700  InnoDB: Assertion failure in file /build/mariadb-10.5-onlwrq/mariadb-10.5-10.5.9+maria~bionic~ubuntu18.04.1~1610445096.8de233af815.10.5/storage/innobase/include/fut0lst.h line 127
tammi 12 20:20:52 mariadbd[63031]: InnoDB: Failing assertion: addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA
tammi 12 20:20:52 mariadbd[63031]: InnoDB: We intentionally generate a memory trap.
tammi 12 20:20:52 mariadbd[63031]: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
tammi 12 20:20:52 mariadbd[63031]: InnoDB: If you get repeated assertion failures or crashes, even
tammi 12 20:20:52 mariadbd[63031]: InnoDB: immediately after the mysqld startup, there may be
tammi 12 20:20:52 mariadbd[63031]: InnoDB: corruption in the InnoDB tablespace. Please refer to
tammi 12 20:20:52 mariadbd[63031]: InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
tammi 12 20:20:52 mariadbd[63031]: InnoDB: about forcing recovery.
tammi 12 20:20:52 mariadbd[63031]: 210112 20:20:52 [ERROR] mysqld got signal 6 ;

Because the daemon is managed by systemd, it automatically tries to restart. The second start fails on a slightly different error:

tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Page [page id: space=0, page number=975] log sequence number 1587610736226 is in the future! Current system log sequence number 1586283836448.
tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Page [page id: space=0, page number=977] log sequence number 1586526112239 is in the future! Current system log sequence number 1586283836448.
tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0x7f3eeaf0d800  InnoDB: Assertion failure in file /build/mariadb-10.5-onlwrq/mariadb-10.5-10.5.9+maria~bionic~ubuntu18.04.1~1610445096.8de233af815.10.5/storage/innobase/dict/dict0load.cc line 1816
tammi 12 20:21:03 sh[6248]: InnoDB: Failing assertion: btr_pcur_is_on_user_rec(&pcur)

Any further restarts repeat this latter error.

Full journald output from both runs attached.

I have also produced core dumps of both crashes and they are attached.

In first process 63031 there is:

#10 0x0000559d3cbdb3fe in ut_dbg_assertion_failed (expr=expr@entry=0x559d3d712a10 "addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA", file=file@entry=0x559d3d712980 "/build/mariadb-10.5-onlwrq/mariadb-10.5-10.5.9+maria~bionic~ubuntu18.04.1~1610445096.8de233af815.10.5/storage/innobase/include/fut0lst.h", line=line@entry=127) at ./storage/innobase/ut/ut0dbg.cc:60
No locals.
#11 0x0000559d3d2f7756 in flst_read_addr (faddr=<optimized out>) at ./storage/innobase/include/fut0lst.h:127
        addr = <optimized out>
        addr = <optimized out>
#12 flst_get_prev_addr (node=<optimized out>) at ./storage/innobase/include/fut0lst.h:153
No locals.
#13 trx_purge_rseg_get_next_history_log (n_pages_handled=n_pages_handled@entry=0x7fd1737fdc78) at ./storage/innobase/trx/trx0purge.cc:856
        prev_log_addr = <optimized out>
        trx_no = <optimized out>
        mtr = {m_last = 0x0, m_last_offset = 0, m_log_mode = 0, m_modifications = 0, m_made_dirty = 0, m_inside_ibuf = 0, m_trim_pages = 0, m_memo = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7fd1737fd278, prev = 0x7fd1737fd278}}, size_ = 1}, m_size = 16, m_first_block = {<ilist_node<void>> = {next = 0x7fd1737fd258, prev = 0x7fd1737fd258}, m_data = "\000l\000\004\322\177\000\000\001", '\000' <repeats 490 times>, m_used = 16}}, m_log = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7fd1737fd4a8, prev = 0x7fd1737fd4a8}}, size_ = 1}, m_size = 0, m_first_block = {<ilist_node<void>> = {next = 0x7fd1737fd488, prev = 0x7fd1737fd488}, m_data = '\000' <repeats 496 times>..., m_used = 0}}, m_user_space = 0x0, m_commit_lsn = 0, m_freed_space = 0x0, m_freed_pages = 0x0}
        undo_page = <optimized out>
        log_hdr = <optimized out>
        empty = <optimized out>
        purge = <optimized out>

The second process has:

#6  0x000055a9f78373fe in ut_dbg_assertion_failed (expr=expr@entry=0x55a9f83a20e8 "btr_pcur_is_on_user_rec(&pcur)", file=file@entry=0x55a9f83a1848 "/build/mariadb-10.5-onlwrq/mariadb-10.5-10.5.9+maria~bionic~ubuntu18.04.1~1610445096.8de233af815.10.5/storage/innobase/dict/dict0load.cc", line=line@entry=1816) at ./storage/innobase/ut/ut0dbg.cc:60
No locals.
#7  0x000055a9f800d623 in dict_load_columns (heap=0x55a9fc21f518, table=0x55a9fc20f480) at ./storage/innobase/dict/dict0load.cc:1816
        err_msg = <optimized out>
        name = 0x0
        nth_v_col = 18446744073709551615
        sys_columns = <optimized out>

The failure happened originally on 10.5.8. I created new builds based on latest 10.5 HEAD with debug symbols etc to ensure that this bug is in fact not fixed in git yet, but exists and if 10.5.9 was released today, it would still fail like this. The attached README.txt includes notes on how I created the debug build with symbols and installed it etc procedure.



 Comments   
Comment by Marko Mäkelä [ 2021-01-13 ]

In the log, we see that the ‘nuclear option’ has been used to switch the server to ‘garbage in, garbage out’ mode:

tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Page [page id: space=0, page number=975] log sequence number 1587610736226 is in the future! Current system log sequence number 1586283836448.

The write-ahead log (redo log) is the glue that guarantees the atomicity and durability of operations in the buffer pool. For example, let us consider a single-statement UPDATE transaction, updating an indexed column. It would consist of the following mini-transactions that atomically change a bunch of pages:

  1. Allocate an undo log page.
  2. Write an undo log record about the update, containing the current values of DB_TRX_ID,DB_ROLL_PTR in the record, as well as the PRIMARY KEY and the secondary key.
  3. Update the DB_TRX_ID,DB_ROLL_PTR in the clustered index record with the current transaction, and and a pointer to the undo log record that was written.
  4. Delete-mark the index record (old_key,primary_key).
  5. Insert the index record (new_key,primary_key).

(Starting with MariaDB 10.3, the first two mini-transactions should be a single mini-transaction.)

The buffer pool is a cache of the data files. It is not a write-through cache. Typically, outstanding modifications could reside in the buffer pool for a long time. Only when a log checkpoint is performed, everything until the new checkpoint LSN must be written back to the data files. Between checkpoints, pages can be written in any order. For example, it could happen that the clustered index page is written to the data file before the undo log page has been written. If the server were killed, the ib_logfile* deleted and the server restarted, then the index and undo pages could be out of sync with each other.

The log file is the glue that guarantees that the server will recover to a particular logical point of time. If that glue is removed, the file system could have copies of pages corresponding to different points of time.

The system tablespace pages 975 and 977 that are mentioned in your "is in the future" output could be undo log pages. Because apparently no such message was issued for the clustered index page, it seems possible that the undo log page corresponds to a newer point of time than the index page. As you can imagine, it is possible that the entire undo log record for the old modification had been purged, freed and reused for something else. Now, it could be the case that the 10.5 server is finding the old index page again and trying to execute purge again. But, the purge had actually been executed in the past! Only, for some reason, the log file as well as the modifications to the index page were lost.

Let me ask you how this corruption was reached:

  • Was the data directory copied with some conventional tools (cp, tar, rsync) while the 10.1 server was running?
  • Were conventional tools used, on a file system snapshot that does not work as advertised?
  • Or, were the files copied properly (while the server is not running, or by using mariabackup or xtrabackup)?

Also, how exactly was the upgrade performed?

  • Did the 10.5 server refuse to start up, saying InnoDB: Upgrade after a crash is not supported?
  • And then, instead of starting and shutting down the 10.1 server once again (or executing mariabackup --prepare with the 10.1 version), someone ‘fixed’ the problem by deleting the logs, or by setting innodb_force_recovery=6, as described in MDEV-24412?

And last but not least, how would you suggest us to improve our documentation?

Comment by Otto Kekäläinen [ 2021-01-13 ]

The message log sequence number 1587610736226 is in the future is from the restart after the first crash. You should focus on reading the gdb-63031.log. That is where the first crash happens. It is quite understandable that the restart fails (visible in the gdb-6501.log) as the mariadbd crashed and left the files in an inconsistent state. I only provided gdb-6501.log as addiitonal information in case it helps shed some additional light into the first crash and the state after it.

> Let me ask you how this corruption was reached

1. MariaDB 10.5 is installed and configures in stand-alone mode (wsrep_on=OFF in config). Everything works.
2. The wsrep_on=OFF is removed, and server started with systemctl restart mariadb
3. Server starts, joins cluster (configured to use maria-backup to transfer data. What happens is visible in journald-mariadb.txt

WSREP_SST: [INFO] Waiting for SST streaming to complete!
..
WSREP_SST: [INFO] Evaluating /usr//bin/mariabackup ...
...
WSREP: SST received
...
InnoDB: 10.5.9 started; log sequence number 0; transaction id 8247886884634323262
...
0x7fd1737fe700 InnoDB: Assertion failure
...

CRASH

These lines indicate the restart:
tammi 12 20:20:59 systemd[1]: Stopped MariaDB 10.5.9 database server.
tammi 12 20:20:59 systemd[1]: Starting MariaDB 10.5.9 database server...

Note that journald does not guarantee the order of each exact line. Thus the crash+stacktrace is not the last output. Or maybe the crash is from one thread and the rest of the mariadbd outputs stuff before is shuts down.

> Also, how exactly was the upgrade performed?

Nothing special. Just `systemctl restart mariadb` -> join starts with SST from donor.

> .. someone ‘fixed’ the problem by deleting the logs ?

Nobody did anything. Only wsrep_on=ON and restart server. The donor 10.1 runs all the time and it does not crash. The crash is in 10.5.9 and it is unable to run.

Comment by Otto Kekäläinen [ 2021-01-13 ]

marko Just to be clear, nobody set `innodb_force_recovery=6` nor deleted logs or did anything else manually. If by looking at the gdb stack trace or journald log it looks as if somebody did, it must be the server itself, which sounds pretty serious.

Comment by Otto Kekäläinen [ 2021-01-13 ]

Seems that InnoDB assertion failures are common error messages that don't actually tell what has gone wrong (https://jira.mariadb.org/browse/MDEV-13542). The root cause is something that corrupted the data, and the assertion error is just the first one that triggers on bad data.

Comment by Otto Kekäläinen [ 2021-01-13 ]

OK, this is not a 10.5.7/8 regression. The server fails to start also with MariaDB 10.5.6 (although there are no assertion errors in 10.5.6):

tammi 13 09:53:37 mariadbd[30231]: WSREP_SST: [INFO] Waiting for SST streaming to complete! (20210113 09:53:37.248)
tammi 13 09:53:38 mariadbd[30231]: 2021-01-13  9:53:38 0 [Note] WSREP: (d21d081b-be02, 'tcp://0.0.0.0:4567') turning message relay requesting off
tammi 13 09:57:56 mariadbd[30231]: 2021-01-13  9:57:56 0 [Note] WSREP: 0.0 (SE-B1): State transfer to 2.0 (SE-B4) complete.
tammi 13 09:57:56 mariadbd[30231]: 2021-01-13  9:57:56 0 [Note] WSREP: Member 0.0 (SE-B1) synced with group.
tammi 13 09:57:56 mariadbd[30231]: WSREP_SST: [INFO] Preparing the backup at /var/lib/mysql//.sst (20210113 09:57:56.895)
tammi 13 09:57:56 mariadbd[30231]: WSREP_SST: [INFO] Evaluating /usr//bin/mariabackup --prepare --no-version-check  $INNOEXTRA --target-dir=${DATA} --mysqld-args $WSREP_SST_OPT_MYSQLD &> /var/lib/mysql//mariabackup.prepare.log (20210113 09:57:56.897)
tammi 13 09:57:57 mariadbd[30231]: WSREP_SST: [INFO] Moving the backup to /var/lib/mysql/ (20210113 09:57:57.685)
tammi 13 09:57:57 mariadbd[30231]: WSREP_SST: [INFO] Evaluating /usr//bin/mariabackup  --move-back --no-version-check   --move-back --force-non-empty-directories --target-dir=${DATA} &> /var/lib/mysql//mariabackup.move.log (20210113 09:57:57.689)
tammi 13 09:57:57 mariadbd[30231]: WSREP_SST: [INFO] Move successful, removing /var/lib/mysql//.sst (20210113 09:57:57.967)
tammi 13 09:57:57 mariadbd[30231]: WSREP_SST: [INFO] Galera co-ords from recovery: 7f74fdea-cda3-11e9-ae2f-ab188b9a01f2:340490612 (20210113 09:57:57.977)
tammi 13 09:57:57 mariadbd[30231]: WSREP_SST: [INFO] Total time on joiner: 0 seconds (20210113 09:57:57.980)
tammi 13 09:57:57 mariadbd[30231]: WSREP_SST: [INFO] Removing the sst_in_progress file (20210113 09:57:57.983)
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 3 [Note] WSREP: SST received
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 3 [Note] WSREP: Server status change joiner -> initializing
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Using Linux native AIO
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Uses event mutexes
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Number of pools: 1
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Using SSE4.2 crc32 instructions
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Initializing buffer pool, total size = 2147483648, chunk size = 134217728
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Completed initialization of buffer pool
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 100663296 bytes
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [Note] InnoDB: New log file created, LSN=1586283788623
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [ERROR] InnoDB: Page [page id: space=0, page number=7] log sequence number 1591467353692 is in the future! Current system log sequence number 1586283788824.
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [ERROR] InnoDB: Page [page id: space=0, page number=2] log sequence number 1591567002974 is in the future! Current system log sequence number 1586283788824.
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [ERROR] InnoDB: Page [page id: space=0, page number=4] log sequence number 1591566428857 is in the future! Current system log sequence number 1586283788824.
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [ERROR] InnoDB: Page [page id: space=0, page number=11] log sequence number 1588725656509 is in the future! Current system log sequence number 1586283788824.
...
tammi 13 09:57:58 mariadbd[30231]: 2021-01-13  9:57:58 0 [ERROR] [FATAL] InnoDB: Trying to read [page id: space=0, page number=1869377383] which is outside the bounds of tablespace innodb_system. Byte offset 0, len 16384
tammi 13 09:57:58 mariadbd[30231]: 210113  9:57:58 [ERROR] mysqld got signal 6 ;

I have been able to upgrade a MariaDB 10.1.x Galera cluster to MariaDB 10.5.6 before on a near identical setup.

So in this case maybe maria-backup is doing something funny while transfering the files. Looking at the data dirs after maria-backup is done with SST they seem to be ok, although on the donor side there are ib_logfile0 (48 MB) and ib_logfile1 (48 MB) while the receiving MariaDB 10.5.6 side has just one ib_logfile0 (96 MB). Both have a ibdata1 with equal size, but on the receiving side the timestamp is 5 minutes older.

Comment by Marko Mäkelä [ 2021-01-13 ]

otto, if the logs really were not wrongly discarded at any point, then this "is in the future" problem is a bug in the backup workflow. Can you get more information on that?

InnoDB crashing on corruption is a design issue (or you might even say, attitude problem), and hopefully some day MDEV-13542 will be fixed. It would be a major undertaking. I would compare this to the huge transition that was made at Microsoft Windows around the time of XP Service Pack 2, when security finally started to trump usability. It is difficult to rewrite a huge code base when all of it is littered with assertions that would kill the server on the slightest corruption.

The line of thinking during InnoDB development has been that error handling is hard, and it is difficult to cover all errors. Part of this thinking was that users should have properly working backups (and that there would no bugs in backup tools; see MDEV-24449 for a recent finding).

Comment by Otto Kekäläinen [ 2021-01-13 ]

Just to be sure, I checked that these system settings are identical on both donor and receiver:

210113 10:35:38 Connecting to MySQL server host: localhost, user: xtrabackup, password: set, port: 3306, socket: /var/run/mysqld/mysqld.sock
Using server version 10.1.48-MariaDB-1~bionic
--innobackupex based on MariaDB server 10.1.48-MariaDB debian-linux-gnu (x86_64)
mariabackup: uses posix_fadvise().
mariabackup: cd to /var/lib/mysql/
mariabackup: open files limit requested 131070, set to 131070
mariabackup: using the following InnoDB configuration:
mariabackup:   innodb_data_home_dir =
mariabackup:   innodb_data_file_path = ibdata1:12M:autoextend
mariabackup:   innodb_log_group_home_dir = ./
mariabackup:   innodb_log_files_in_group = 2
mariabackup:   innodb_log_file_size = 50331648
mariabackup: using O_DIRECT

It did not help however.

Also I see that InnoDB does align the log sizes on the fly anyway:

mariadbd[63536]: 2021-01-13 10:49:53 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes
mariadbd[63536]: 2021-01-13 10:49:53 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
mariadbd[63536]: 2021-01-13 10:49:53 0 [Note] InnoDB: New log file created, LSN=1586283788623
mariadbd[63536]: 2021-01-13 10:49:53 0 [ERROR] InnoDB: Page [page id: space=0, page number=7] log sequence number 1591613047690

Comment by Marko Mäkelä [ 2021-01-13 ]

I implemented the log resizing (rebuild) on server startup in MySQL 5.6. That should be safe; at least I am not aware of any problem with that.

I have not reviewed the Galera snapshot transfer scripts. I hope that they are not ignoring any nonzero exit status from mariabackup, or that mariabackup did not wrongly return zero exit status even after encountering an error. (We have had bugs of the latter form in the past.) Can you get complete logs of all mariabackup invocations?

Comment by Otto Kekäläinen [ 2021-01-13 ]

Ok I solved this now.

The root cause was that apparently SST with maria-backup from 10.1->10.5 does not work, it does not properly run the prepare etc steps or maybe it just the logic that InnoDB has to upgrade redo logs etc.

Solution:
1. Remove 10.5
2. Install 10.1, join the cluster of other 10.1 instances
3. Install 10.5 on top of 10.1 so it does an in-place upgrade (by InnoDB and mariadb-upgrade without any maria-backup involved) and immediately syncs to the cluster back using IST
4. Profit!

The key here is to do it quickly. If too much time passes and IST can't catch up and the joiner starts running SST it will always fail.

Comment by Marko Mäkelä [ 2021-01-14 ]

otto, to narrow down the cause further, I would suggest that you try to point out the problematic steps or omissions in the SST scripts. Ideally, a patched script would be good.

I have a few remarks that may be useful for ensuring that this will be fixed by our Galera team:

  1. The backup must be both made and prepared by the same major version of mariabackup as the server.
  2. It is not allowed to ignore any errors.
  3. It would be good if the complete output of the mariabackup invocations were made available somewhere, at least as an option.
  4. When upgrading from an earlier major release than 10.3 to 10.3 or later, the undo log format change that was made in MDEV-12288 should be accounted for, as noted in MDEV-23755, to avoid hitting MDEV-15912 after the upgrade. I think that this could technically be done in the mariabackup --prepare a.k.a. innobackupex --apply-log step. In 10.2, mariabackup --prepare --rollback-xa should be sufficient (but somewhat overkill). In 10.1, I guess that we must cross our fingers and hope that any recovered transactions will have been rolled back.

In this reported case, I do not think that we encountered the last point, but I want to mention it for completeness.

Comment by Jan Lindström (Inactive) [ 2021-01-14 ]

otto Does it work if you do Galera rolling upgrade 10.1->10.2->10.3->10.4->10.5 so not directly 10.1->10.5 ?. To further analyze it would be nice to have mariabackup logs.

Comment by Otto Kekäläinen [ 2021-01-17 ]

marko It seems to me that the SST join process never runs `mysql_upgrade` and thus InnoDB fails on assertion error. When the IST runs there are also `Table mysql/innodb_index_stats has length mismatch in the column name table_name.` warnings right up until the upgrade starts:

/etc/mysql/debian-start[7087]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
/etc/mysql/debian-start[7087]: Looking for 'mysql' as: /usr/bin/mysql
/etc/mysql/debian-start[7087]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
/etc/mysql/debian-start[7087]: Phase 1/7: Checking and upgrading mysql database
/etc/mysql/debian-start[7087]: Running 'mysqlcheck' with connection arguments: --port='3306' --
socket='/var/run/mysqld/mysqld.sock'
/etc/mysql/debian-start[7087]: # Connecting to localhost...
/etc/mysql/debian-start[7087]: # Disconnecting from localhost...
/etc/mysql/debian-start[7087]: Processing databases

jplindst Trying an 10.1->10.2->10.3->10.4->10.5 upgrade would be waste of time. I think we can just conclude that SST is not a supported upgrade method, but with IST even the 10.1 -> 10.5 upgrade works (as the result of this experience was).

Comment by Otto Kekäläinen [ 2021-01-18 ]

In my opinion, to close this bug we should either

1) Documentation: Emit some warning / fatal error that a SST from a major older version is not possible so that the process stops with a sensible error message and does not proceed to crashing InnoDB (and cause "false" alarms and people doing gdb traces in vain).

OR

2) Upgrade proofing: Alternatively make the Galera join process detect that the SST is from an older version and trigger a mariadb-upgrade run before InnoDB starts.

Comment by Jan Lindström (Inactive) [ 2021-01-19 ]

The supported method to do upgrades is documented e.g. here https://mariadb.com/kb/en/upgrading-from-mariadb-101-to-mariadb-102-with-galera-cluster/ . SST is not a upgrade method because it just copies database state from donor to joiner. This database state copy means all tables including mysql database. If tables are not compatible for server version on joiner there is not much we could do.

Comment by Christopher Yeleighton [ 2021-02-13 ]

2021-02-13 14:49:02 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=452120869,452120869
2021-02-13 14:49:03 0 [Note] InnoDB: 128 rollback segments are active.
2021-02-13 14:49:03 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2021-02-13 14:49:03 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2021-02-13 14:49:03 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2021-02-13 14:49:03 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
2021-02-13 14:49:03 0 [Note] InnoDB: 10.5.8 started; log sequence number 452125597; transaction id 340175
2021-02-13 14:49:03 0x7fffdaffd640 InnoDB: Assertion failure in file /home/abuild/rpmbuild/BUILD/mariadb-10.5.8/storage/innobase/include/fut0lst.h line 127
InnoDB: Failing assertion: addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA

Also, assertions are not the right way to deal with inconsistent data. The purpose of assertions is to catch logic errors in the development phase.

Generated at Thu Feb 08 09:31:03 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.