Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-24578

MariaDB 10.5 fails to join Galera cluster of MariaDB 10.1, root cause SST failure, symptom is InnoDB: Assertion failure

Details

    Description

      (This was originally filed as https://jira.mariadb.org/browse/MDEV-24162, but mistakenly marked as duplicate of MDEV-23672, which it is not)

      While trying to join an existing Galera cluster of two MariaDB 10.1.48 nodes with a fresh install of MariaDB 10.5.8, the maria-backup part runs fine but once all files have been transferred and the mariadbd tries to start it fails on:

      tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Evaluating /usr//bin/mariabackup  --move-back --no-version-check   --move-back --force-non-empty-directories --target-dir=${DATA} &> /var/lib/mysql//mariabackup.move.log (20210112 20:20:51.467)
      tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Move successful, removing /var/lib/mysql//.sst (20210112 20:20:51.742)
      tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Galera co-ords from recovery: 7f74fdea-cda3-11e9-ae2f-ab188b9a01f2:339777552 (20210112 20:20:51.752)
      tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Total time on joiner: 0 seconds (20210112 20:20:51.756)
      tammi 12 20:20:51 mariadbd[63031]: WSREP_SST: [INFO] Removing the sst_in_progress file (20210112 20:20:51.759)
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 3 [Note] WSREP: SST received
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 3 [Note] WSREP: Server status change joiner -> initializing
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Uses event mutexes
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Number of pools: 1
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Using Linux native AIO
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Initializing buffer pool, total size = 2147483648, chunk size = 134217728
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Completed initialization of buffer pool
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 100663296 bytes
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
      tammi 12 20:20:51 mariadbd[63031]: 2021-01-12 20:20:51 0 [Note] InnoDB: New log file created, LSN=1586283788623
      tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: 128 rollback segments are active.
      tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: Creating sys_virtual system tables.
      tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: Creating shared tablespace for temporary tables
      tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
      tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0 [Note] InnoDB: 10.5.9 started; log sequence number 0; transaction id 8247886884634323262
      tammi 12 20:20:52 mariadbd[63031]: 2021-01-12 20:20:52 0x7fd1737fe700  InnoDB: Assertion failure in file /build/mariadb-10.5-onlwrq/mariadb-10.5-10.5.9+maria~bionic~ubuntu18.04.1~1610445096.8de233af815.10.5/storage/innobase/include/fut0lst.h line 127
      tammi 12 20:20:52 mariadbd[63031]: InnoDB: Failing assertion: addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA
      tammi 12 20:20:52 mariadbd[63031]: InnoDB: We intentionally generate a memory trap.
      tammi 12 20:20:52 mariadbd[63031]: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      tammi 12 20:20:52 mariadbd[63031]: InnoDB: If you get repeated assertion failures or crashes, even
      tammi 12 20:20:52 mariadbd[63031]: InnoDB: immediately after the mysqld startup, there may be
      tammi 12 20:20:52 mariadbd[63031]: InnoDB: corruption in the InnoDB tablespace. Please refer to
      tammi 12 20:20:52 mariadbd[63031]: InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      tammi 12 20:20:52 mariadbd[63031]: InnoDB: about forcing recovery.
      tammi 12 20:20:52 mariadbd[63031]: 210112 20:20:52 [ERROR] mysqld got signal 6 ;
      

      Because the daemon is managed by systemd, it automatically tries to restart. The second start fails on a slightly different error:

      tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Page [page id: space=0, page number=975] log sequence number 1587610736226 is in the future! Current system log sequence number 1586283836448.
      tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Page [page id: space=0, page number=977] log sequence number 1586526112239 is in the future! Current system log sequence number 1586283836448.
      tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      tammi 12 20:21:03 sh[6248]: 2021-01-12 20:21:00 0x7f3eeaf0d800  InnoDB: Assertion failure in file /build/mariadb-10.5-onlwrq/mariadb-10.5-10.5.9+maria~bionic~ubuntu18.04.1~1610445096.8de233af815.10.5/storage/innobase/dict/dict0load.cc line 1816
      tammi 12 20:21:03 sh[6248]: InnoDB: Failing assertion: btr_pcur_is_on_user_rec(&pcur)
      

      Any further restarts repeat this latter error.

      Full journald output from both runs attached.

      I have also produced core dumps of both crashes and they are attached.

      In first process 63031 there is:

      #10 0x0000559d3cbdb3fe in ut_dbg_assertion_failed (expr=expr@entry=0x559d3d712a10 "addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA", file=file@entry=0x559d3d712980 "/build/mariadb-10.5-onlwrq/mariadb-10.5-10.5.9+maria~bionic~ubuntu18.04.1~1610445096.8de233af815.10.5/storage/innobase/include/fut0lst.h", line=line@entry=127) at ./storage/innobase/ut/ut0dbg.cc:60
      No locals.
      #11 0x0000559d3d2f7756 in flst_read_addr (faddr=<optimized out>) at ./storage/innobase/include/fut0lst.h:127
              addr = <optimized out>
              addr = <optimized out>
      #12 flst_get_prev_addr (node=<optimized out>) at ./storage/innobase/include/fut0lst.h:153
      No locals.
      #13 trx_purge_rseg_get_next_history_log (n_pages_handled=n_pages_handled@entry=0x7fd1737fdc78) at ./storage/innobase/trx/trx0purge.cc:856
              prev_log_addr = <optimized out>
              trx_no = <optimized out>
              mtr = {m_last = 0x0, m_last_offset = 0, m_log_mode = 0, m_modifications = 0, m_made_dirty = 0, m_inside_ibuf = 0, m_trim_pages = 0, m_memo = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7fd1737fd278, prev = 0x7fd1737fd278}}, size_ = 1}, m_size = 16, m_first_block = {<ilist_node<void>> = {next = 0x7fd1737fd258, prev = 0x7fd1737fd258}, m_data = "\000l\000\004\322\177\000\000\001", '\000' <repeats 490 times>, m_used = 16}}, m_log = {m_heap = 0x0, m_list = {<ilist<mtr_buf_t::block_t, void>> = {sentinel_ = {next = 0x7fd1737fd4a8, prev = 0x7fd1737fd4a8}}, size_ = 1}, m_size = 0, m_first_block = {<ilist_node<void>> = {next = 0x7fd1737fd488, prev = 0x7fd1737fd488}, m_data = '\000' <repeats 496 times>..., m_used = 0}}, m_user_space = 0x0, m_commit_lsn = 0, m_freed_space = 0x0, m_freed_pages = 0x0}
              undo_page = <optimized out>
              log_hdr = <optimized out>
              empty = <optimized out>
              purge = <optimized out>
      

      The second process has:

      #6  0x000055a9f78373fe in ut_dbg_assertion_failed (expr=expr@entry=0x55a9f83a20e8 "btr_pcur_is_on_user_rec(&pcur)", file=file@entry=0x55a9f83a1848 "/build/mariadb-10.5-onlwrq/mariadb-10.5-10.5.9+maria~bionic~ubuntu18.04.1~1610445096.8de233af815.10.5/storage/innobase/dict/dict0load.cc", line=line@entry=1816) at ./storage/innobase/ut/ut0dbg.cc:60
      No locals.
      #7  0x000055a9f800d623 in dict_load_columns (heap=0x55a9fc21f518, table=0x55a9fc20f480) at ./storage/innobase/dict/dict0load.cc:1816
              err_msg = <optimized out>
              name = 0x0
              nth_v_col = 18446744073709551615
              sys_columns = <optimized out>
      

      The failure happened originally on 10.5.8. I created new builds based on latest 10.5 HEAD with debug symbols etc to ensure that this bug is in fact not fixed in git yet, but exists and if 10.5.9 was released today, it would still fail like this. The attached README.txt includes notes on how I created the debug build with symbols and installed it etc procedure.

      Attachments

        1. gdb-63031.log
          167 kB
        2. gdb-6520.log
          41 kB
        3. journald-mariadb.txt
          176 kB
        4. README.txt
          3 kB

        Issue Links

          Activity

            otto Does it work if you do Galera rolling upgrade 10.1->10.2->10.3->10.4->10.5 so not directly 10.1->10.5 ?. To further analyze it would be nice to have mariabackup logs.

            jplindst Jan Lindström (Inactive) added a comment - otto Does it work if you do Galera rolling upgrade 10.1->10.2->10.3->10.4->10.5 so not directly 10.1->10.5 ?. To further analyze it would be nice to have mariabackup logs.

            marko It seems to me that the SST join process never runs `mysql_upgrade` and thus InnoDB fails on assertion error. When the IST runs there are also `Table mysql/innodb_index_stats has length mismatch in the column name table_name.` warnings right up until the upgrade starts:

            /etc/mysql/debian-start[7087]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
            /etc/mysql/debian-start[7087]: Looking for 'mysql' as: /usr/bin/mysql
            /etc/mysql/debian-start[7087]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
            /etc/mysql/debian-start[7087]: Phase 1/7: Checking and upgrading mysql database
            /etc/mysql/debian-start[7087]: Running 'mysqlcheck' with connection arguments: --port='3306' --
            socket='/var/run/mysqld/mysqld.sock'
            /etc/mysql/debian-start[7087]: # Connecting to localhost...
            /etc/mysql/debian-start[7087]: # Disconnecting from localhost...
            /etc/mysql/debian-start[7087]: Processing databases
            

            jplindst Trying an 10.1->10.2->10.3->10.4->10.5 upgrade would be waste of time. I think we can just conclude that SST is not a supported upgrade method, but with IST even the 10.1 -> 10.5 upgrade works (as the result of this experience was).

            otto Otto Kekäläinen added a comment - marko It seems to me that the SST join process never runs `mysql_upgrade` and thus InnoDB fails on assertion error. When the IST runs there are also `Table mysql/innodb_index_stats has length mismatch in the column name table_name.` warnings right up until the upgrade starts: /etc/mysql/debian-start[7087]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored /etc/mysql/debian-start[7087]: Looking for 'mysql' as: /usr/bin/mysql /etc/mysql/debian-start[7087]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck /etc/mysql/debian-start[7087]: Phase 1/7: Checking and upgrading mysql database /etc/mysql/debian-start[7087]: Running 'mysqlcheck' with connection arguments: --port='3306' -- socket='/var/run/mysqld/mysqld.sock' /etc/mysql/debian-start[7087]: # Connecting to localhost... /etc/mysql/debian-start[7087]: # Disconnecting from localhost... /etc/mysql/debian-start[7087]: Processing databases jplindst Trying an 10.1->10.2->10.3->10.4->10.5 upgrade would be waste of time. I think we can just conclude that SST is not a supported upgrade method, but with IST even the 10.1 -> 10.5 upgrade works (as the result of this experience was).

            In my opinion, to close this bug we should either

            1) Documentation: Emit some warning / fatal error that a SST from a major older version is not possible so that the process stops with a sensible error message and does not proceed to crashing InnoDB (and cause "false" alarms and people doing gdb traces in vain).

            OR

            2) Upgrade proofing: Alternatively make the Galera join process detect that the SST is from an older version and trigger a mariadb-upgrade run before InnoDB starts.

            otto Otto Kekäläinen added a comment - In my opinion, to close this bug we should either 1) Documentation: Emit some warning / fatal error that a SST from a major older version is not possible so that the process stops with a sensible error message and does not proceed to crashing InnoDB (and cause "false" alarms and people doing gdb traces in vain). OR 2) Upgrade proofing: Alternatively make the Galera join process detect that the SST is from an older version and trigger a mariadb-upgrade run before InnoDB starts.

            The supported method to do upgrades is documented e.g. here https://mariadb.com/kb/en/upgrading-from-mariadb-101-to-mariadb-102-with-galera-cluster/ . SST is not a upgrade method because it just copies database state from donor to joiner. This database state copy means all tables including mysql database. If tables are not compatible for server version on joiner there is not much we could do.

            jplindst Jan Lindström (Inactive) added a comment - The supported method to do upgrades is documented e.g. here https://mariadb.com/kb/en/upgrading-from-mariadb-101-to-mariadb-102-with-galera-cluster/ . SST is not a upgrade method because it just copies database state from donor to joiner. This database state copy means all tables including mysql database. If tables are not compatible for server version on joiner there is not much we could do.
            yecril71pl Christopher Yeleighton added a comment - - edited

            2021-02-13 14:49:02 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=452120869,452120869
            2021-02-13 14:49:03 0 [Note] InnoDB: 128 rollback segments are active.
            2021-02-13 14:49:03 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
            2021-02-13 14:49:03 0 [Note] InnoDB: Creating shared tablespace for temporary tables
            2021-02-13 14:49:03 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
            2021-02-13 14:49:03 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
            2021-02-13 14:49:03 0 [Note] InnoDB: 10.5.8 started; log sequence number 452125597; transaction id 340175
            2021-02-13 14:49:03 0x7fffdaffd640 InnoDB: Assertion failure in file /home/abuild/rpmbuild/BUILD/mariadb-10.5.8/storage/innobase/include/fut0lst.h line 127
            InnoDB: Failing assertion: addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA

            Also, assertions are not the right way to deal with inconsistent data. The purpose of assertions is to catch logic errors in the development phase.

            yecril71pl Christopher Yeleighton added a comment - - edited 2021-02-13 14:49:02 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=452120869,452120869 2021-02-13 14:49:03 0 [Note] InnoDB: 128 rollback segments are active. 2021-02-13 14:49:03 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1" 2021-02-13 14:49:03 0 [Note] InnoDB: Creating shared tablespace for temporary tables 2021-02-13 14:49:03 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 2021-02-13 14:49:03 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 2021-02-13 14:49:03 0 [Note] InnoDB: 10.5.8 started; log sequence number 452125597; transaction id 340175 2021-02-13 14:49:03 0x7fffdaffd640 InnoDB: Assertion failure in file /home/abuild/rpmbuild/BUILD/mariadb-10.5.8/storage/innobase/include/fut0lst.h line 127 InnoDB: Failing assertion: addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA Also, assertions are not the right way to deal with inconsistent data. The purpose of assertions is to catch logic errors in the development phase.

            People

              jplindst Jan Lindström (Inactive)
              otto Otto Kekäläinen
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.