[MDEV-29321] Percona XtraDB 5.7 can't be upgrade to MariaDB 10.6 or above - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.6.8, 10.6
Fix Version/s: 10.6.10, 10.7.6, 10.8.5, 10.9.3, 10.10.2
Component/s: Galera, Storage Engine - InnoDB
Labels:
- Buster
- Debian
- innodb

Description

If one first installs Percona XtraDB 5.7 and after that tries to upgrade to MariaDB 10.6 server SigSEGV with signal 11. Backtrace for the crash is:

#0  trx_rseg_t::get (err=0x7ffeecd0122c, mtr=0x7ffeecd01840, this=0x5576091a1100 <trx_sys+16768>) at ./storage/innobase/trx/trx0rseg.cc:293

#1  trx_rseg_update_wsrep_checkpoint (xid=<optimized out>, mtr=0x7ffeecd01840) at ./storage/innobase/trx/trx0rseg.cc:155

#2  0x0000557608821946 in trx_rseg_array_init () at ./storage/innobase/trx/trx0rseg.cc:638

#3  0x0000557608827d29 in trx_lists_init_at_db_start () at ./storage/innobase/trx/trx0trx.cc:718

#4  0x00005576080f7cff in srv_start (create_new_db=<optimized out>) at ./storage/innobase/srv/srv0start.cc:1470

#5  0x0000557608718611 in innodb_init (p=<optimized out>) at ./storage/innobase/handler/ha_innodb.cc:4317

#6  0x000055760845a8b4 in ha_initialize_handlerton (plugin=0x55760b43cb20) at ./sql/handler.cc:659

#7  0x000055760824712b in plugin_initialize (tmp_root=tmp_root@entry=0x7ffeecd072e0, plugin=plugin@entry=0x55760b43cb20, argc=argc@entry=0x55760922c7a0 <remaining_argc>, argv=argv@entry=0x55760b404930,

    options_only=options_only@entry=false) at ./sql/sql_plugin.cc:1465

#8  0x0000557608248423 in plugin_init (argc=argc@entry=0x55760922c7a0 <remaining_argc>, argv=<optimized out>, flags=0) at ./sql/sql_plugin.cc:1758

#9  0x00005576081711cd in init_server_components () at ./sql/mysqld.cc:5091

#10 0x0000557608176d27 in mysqld_main (argc=<optimized out>, argv=<optimized out>) at ./sql/mysqld.cc:5708

#11 0x00007fa7d6ddc09b in __libc_start_main (main=0x55760811dd90 <main(int, char**)>, argc=2, argv=0x7ffeecd09d48, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,

    stack_end=0x7ffeecd09d38) at ../csu/libc-start.c:308

#12 0x000055760816b21a in _start () at ./sql/mysqld.cc:4521

This is probably something to WSRep (as in backtrace) but when runnign Percona XtraDB there is some additional WSRep plugin which is not available when running plain MariaDB.

/usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --wsrep-provider=/usr/lib/galera3/libgalera_smm.so --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1

This bug has happen between current release and release before that. To test this it's easiest to use Podman or Docker Debian 10 (Buster) image and run script

sed -i "s/101/0/g" -i /usr/sbin/policy-rc.d

echo -e '#!/bin/sh\necho "N 5"' > /sbin/runlevel

chmod +x /sbin/runlevel

echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections

apt-get update

apt-get install --no-install-recommends --yes gpg gpg-agent dirmngr ca-certificates

apt-key adv --recv-keys --keyserver hkps://keyserver.ubuntu.com:443 9334A25F8507EFA5

echo "deb https://repo.percona.com/apt/ buster main" > /etc/apt/sources.list.d/mysql.list

apt-get update

apt-get install -y percona-xtradb-cluster-full-57 percona-xtrabackup-24 percona-toolkit pmm2-client

apt-get install apt-transport-https curl

curl -o /etc/apt/trusted.gpg.d/mariadb_release_signing_key.asc 'https://mariadb.org/mariadb_release_signing_key.asc'

sh -c "echo 'deb https://mirrors.xtom.ee/mariadb/repo/10.6/debian buster main' >>/etc/apt/sources.list"

apt-get update

apt-get install mariadb-server

Server crashes event with `--help`

root@0912b3e1389f:/build# mariadbd --user=root --help --verbose

2022-08-18  7:48:44 0 [Note] InnoDB: Compressed tables use zlib 1.2.11

2022-08-18  7:48:44 0 [Note] InnoDB: Number of pools: 1

2022-08-18  7:48:44 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions

2022-08-18  7:48:44 0 [Note] mariadbd: O_TMPFILE is not supported on /tmp (disabling future attempts)

2022-08-18  7:48:44 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728

2022-08-18  7:48:44 0 [Note] InnoDB: Completed initialization of buffer pool

220818  7:48:44 [ERROR] mysqld got signal 11 ;

This could be because you hit a bug. It is also possible that this binary

or one of the libraries it was linked against is corrupt, improperly built,

or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help

diagnose the problem, but since we have already crashed,

something is definitely wrong and this may fail.

Server version: 10.6.10-MariaDB-1:10.6.10+maria~deb10

key_buffer_size=134217728

read_buffer_size=131072

max_used_connections=0

max_threads=153

thread_count=0

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467959 K  bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 0x0 thread_stack 0x49000

Printing to addr2line failed

mariadbd(my_print_stacktrace+0x2e)[0x562d6424bc0e]

mariadbd(handle_fatal_signal+0x485)[0x562d63d31215]

/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7fef84314730]

mariadbd(+0xd724e3)[0x562d640f44e3]

mariadbd(+0xd793c6)[0x562d640fb3c6]

mariadbd(+0xd7f7a9)[0x562d641017a9]

mariadbd(+0x64f95f)[0x562d639d195f]

mariadbd(+0xc70371)[0x562d63ff2371]

mariadbd(_Z24ha_initialize_handlertonP13st_plugin_int+0x84)[0x562d63d345b4]

mariadbd(+0x79ee2b)[0x562d63b20e2b]

mariadbd(_Z11plugin_initPiPPci+0xaac)[0x562d63b2218c]

mariadbd(+0x6c8ecd)[0x562d63a4aecd]

mariadbd(_Z11mysqld_mainiPPc+0x3f7)[0x562d63a50a27]

/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb)[0x7fef83e5809b]

mariadbd(_start+0x2a)[0x562d63a44f1a]

The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains

information that should help you find out what is causing the crash.

Writing a core file...

Working directory at /var/lib/mysql

Resource Limits:

Limit                     Soft Limit           Hard Limit           Units

Max cpu time              unlimited            unlimited            seconds

Max file size             unlimited            unlimited            bytes

Max data size             unlimited            unlimited            bytes

Max stack size            8388608              unlimited            bytes

Max core file size        unlimited            unlimited            bytes

Max resident set          unlimited            unlimited            bytes

Max processes             unlimited            unlimited            processes

Max open files            1048576              1048576              files

Max locked memory         65536                65536                bytes

Max address space         unlimited            unlimited            bytes

Max file locks            unlimited            unlimited            locks

Max pending signals       11539                11539                signals

Max msgqueue size         819200               819200               bytes

Max nice priority         0                    0

Max realtime priority     0                    0

Max realtime timeout      unlimited            unlimited            us

Core pattern: |/bin/false

Kernel version: Linux version 5.14.21-150400.22-default (geeko@buildhost) (gcc (SUSE Linux) 7.5.0, GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.37.20211103-150100.7.29) #1 SMP PREEMPT_DYNAMIC Wed May 11 06:57:18 UTC 2022 (49db222)

Segmentation fault (core dumped)

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

mysqld-perconaxtradb57.log.xz
6 kB
2022-08-18 10:32
varlibmysql-perconaxtradb57.tar.gz
615 kB
2022-08-18 10:32

Issue Links

is caused by

MDEV-13542 Crashing on a corrupted page is unhelpful

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2022-08-18 07:11

The crash occurs in trx_rseg_format_upgrade(). I don’t think that this code has substantially changed since 10.5 (~~MDEV-12353~~), or 10.4 (~~MDEV-17138~~ introduced a MLOG_MEMSET record), or 10.3 (~~MDEV-12288~~).

illuusio, what do you mean by "between current release and release before that"? What would happen with the recently released 10.6.9? Does the server start up if Galera is not being used? (Are you sure that you are using the correct version of Galera? Starting with MariaDB Server 10.4.2, libgalera4 is supposed to be used.

Marko Mäkelä added a comment - 2022-08-18 07:11 The crash occurs in trx_rseg_format_upgrade() . I don’t think that this code has substantially changed since 10.5 ( MDEV-12353 ), or 10.4 ( MDEV-17138 introduced a MLOG_MEMSET record), or 10.3 ( MDEV-12288 ). illuusio , what do you mean by "between current release and release before that"? What would happen with the recently released 10.6.9? Does the server start up if Galera is not being used? (Are you sure that you are using the correct version of Galera? Starting with MariaDB Server 10.4.2 , libgalera4 is supposed to be used.

Tuukka Pasanen added a comment - 2022-08-18 07:29

This test if for Debian 10 (Buster) and Percona XtraDB 5.7 requires Galera 3.x and MariaDB 10.6 requires Galera 4.x (which is available on Backports).

I'll do test build with https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/f4d06db1293c3ec340ae2be6347971274ba38160 (Debian 10.6.8 commit) in Salsa-CI when it was working. It could be also something that Percona has made during this time.

This run is 2 months ago https://salsa.debian.org/mariadb-team/mariadb-server/-/jobs/2798656 and it works as expected.

Tuukka Pasanen added a comment - 2022-08-18 07:29 This test if for Debian 10 (Buster) and Percona XtraDB 5.7 requires Galera 3.x and MariaDB 10.6 requires Galera 4.x (which is available on Backports). I'll do test build with https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/f4d06db1293c3ec340ae2be6347971274ba38160 (Debian 10.6.8 commit) in Salsa-CI when it was working. It could be also something that Percona has made during this time. This run is 2 months ago https://salsa.debian.org/mariadb-team/mariadb-server/-/jobs/2798656 and it works as expected.

Marko Mäkelä added a comment - 2022-08-18 09:15

I do not see any message like InnoDB: Upgrading redo log:. There should have been one, unless the ib_logfile* were deleted before MariaDB was started.

Please make sure that the log files are not deleted. Potentially, the XtraDB server hung on shutdown and was forcibly killed. If the log files are deleted (or innodb_force_recovery=6 is used), basically all bets are off.

Marko Mäkelä added a comment - 2022-08-18 09:15 I do not see any message like InnoDB: Upgrading redo log: . There should have been one, unless the ib_logfile* were deleted before MariaDB was started. Please make sure that the log files are not deleted. Potentially, the XtraDB server hung on shutdown and was forcibly killed. If the log files are deleted (or innodb_force_recovery=6 is used), basically all bets are off.

Tuukka Pasanen added a comment - 2022-08-18 10:33

I attached mysql dir before MariaDB is upgraded and `mysql.log` fom Percona XtraDB launch

Tuukka Pasanen added a comment - 2022-08-18 10:33 I attached mysql dir before MariaDB is upgraded and `mysql.log` fom Percona XtraDB launch

Marko Mäkelä added a comment - 2022-08-18 13:01

I can repeat this on the varlibmysql-perconaxtradb57.tar.gz (which is actually in xz format, tar xJf). A message about the redo log upgrade was not written yet, but recv_log_recover_10_4() did recognize the logs as being logically empty (clean).

This failure is related to an incompatible change that was made in MySQL 5.7 but not in MariaDB (see ~~MDEV-12288~~).

10.6 af552f2903b9764e3bc0634a0037c39d1a837f4f
#2 trx_rseg_update_wsrep_checkpoint (xid=xid@entry=0x7ffd16bf1948, mtr=mtr@entry=0x7ffd16bf1490) at /mariadb/10.6/storage/innobase/trx/trx0rseg.cc:155
155 if (buf_block_t* block= trx_sys.rseg_array[rseg_id].get(mtr, &err))
(rr) p rseg_id
$2 = 1
(rr) p trx_sys.rseg_array[1]
$3 = {space = 0x0, latch = {pfs_psi = 0x0, lock = {writer = {lock = std::atomic<unsigned int> = { 0 }}, readers = std::atomic<unsigned int> = { 0 }}}, page_no = 0, history_size = 0,
ref = std::atomic<unsigned int> = { 0 }, curr_size = 0, undo_list = {count = 0, start = 0x0, end = 0x0, node = NULL, init = 0}, undo_cached = {count = 0, start = 0x0, end = 0x0, node = NULL, init = 0},
last_page_no = 0, last_commit_and_offset = 0}

We skipped this rollback segment in trx_rseg_array_init() because the page number was FIL_NULL.

Some logic was changed in ~~MDEV-13542~~. The fix is simple:

diff --git a/storage/innobase/trx/trx0rseg.cc b/storage/innobase/trx/trx0rseg.cc

index 34ec21d6013..760c4e707ce 100644

--- a/storage/innobase/trx/trx0rseg.cc

+++ b/storage/innobase/trx/trx0rseg.cc

@@ -290,6 +290,11 @@ bool trx_rseg_read_wsrep_checkpoint(XID& xid)

 buf_block_t *trx_rseg_t::get(mtr_t *mtr, dberr_t *err) const

+  if (!space)

+  {

+    if (err) *err= DB_TABLESPACE_NOT_FOUND;

+    return nullptr;

+  }

   return buf_page_get_gen(page_id(), 0, RW_X_LATCH, nullptr,

                           BUF_GET, mtr, err);

@@ -435,6 +440,8 @@ static dberr_t trx_undo_lists_init(trx_rseg_t *rseg, trx_id_t &max_trx_id,

 static dberr_t trx_rseg_mem_restore(trx_rseg_t *rseg, trx_id_t &max_trx_id,

                                     mtr_t *mtr)

+  if (!rseg->space)

+    return DB_TABLESPACE_NOT_FOUND;

   dberr_t err;

   const buf_block_t *rseg_hdr=

     buf_page_get_gen(rseg->page_id(), 0, RW_S_LATCH, nullptr, BUF_GET, mtr,

With that, the server starts up:

10.6 af552f2903b9764e3bc0634a0037c39d1a837f4f with patch
2022-08-18 15:58:01 0 [Note] InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!!
2022-08-18 15:58:01 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2022-08-18 15:58:01 0 [Note] InnoDB: Using transactional memory
2022-08-18 15:58:01 0 [Note] InnoDB: Number of pools: 1
2022-08-18 15:58:01 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2022-08-18 15:58:01 0 [Note] InnoDB: Using liburing
2022-08-18 15:58:01 0 [Note] InnoDB: Initializing buffer pool, total size = 3221225472, chunk size = 134217728
2022-08-18 15:58:01 0 [Note] InnoDB: Completed initialization of buffer pool
2022-08-18 15:58:01 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed
2022-08-18 15:58:01 0 [Note] InnoDB: Upgrading redo log: 4294967296 bytes; LSN=2757622
2022-08-18 15:58:01 0 [Note] InnoDB: Starting to delete and rewrite log file.

Marko Mäkelä added a comment - 2022-08-18 13:01 I can repeat this on the varlibmysql-perconaxtradb57.tar.gz (which is actually in xz format, tar xJf ). A message about the redo log upgrade was not written yet, but recv_log_recover_10_4() did recognize the logs as being logically empty (clean). This failure is related to an incompatible change that was made in MySQL 5.7 but not in MariaDB (see MDEV-12288 ). 10.6 af552f2903b9764e3bc0634a0037c39d1a837f4f #2 trx_rseg_update_wsrep_checkpoint (xid=xid@entry=0x7ffd16bf1948, mtr=mtr@entry=0x7ffd16bf1490) at /mariadb/10.6/storage/innobase/trx/trx0rseg.cc:155 155 if (buf_block_t* block= trx_sys.rseg_array[rseg_id].get(mtr, &err)) (rr) p rseg_id $2 = 1 (rr) p trx_sys.rseg_array[1] $3 = {space = 0x0, latch = {pfs_psi = 0x0, lock = {writer = {lock = std::atomic<unsigned int> = { 0 }}, readers = std::atomic<unsigned int> = { 0 }}}, page_no = 0, history_size = 0, ref = std::atomic<unsigned int> = { 0 }, curr_size = 0, undo_list = {count = 0, start = 0x0, end = 0x0, node = NULL, init = 0}, undo_cached = {count = 0, start = 0x0, end = 0x0, node = NULL, init = 0}, last_page_no = 0, last_commit_and_offset = 0} We skipped this rollback segment in trx_rseg_array_init() because the page number was FIL_NULL . Some logic was changed in MDEV-13542 . The fix is simple: diff --git a/storage/innobase/trx/trx0rseg.cc b/storage/innobase/trx/trx0rseg.cc index 34ec21d6013..760c4e707ce 100644 --- a/storage/innobase/trx/trx0rseg.cc +++ b/storage/innobase/trx/trx0rseg.cc @@ -290,6 +290,11 @@ bool trx_rseg_read_wsrep_checkpoint(XID& xid) buf_block_t *trx_rseg_t::get(mtr_t *mtr, dberr_t *err) const { + if (!space) + { + if (err) *err= DB_TABLESPACE_NOT_FOUND; + return nullptr; + } return buf_page_get_gen(page_id(), 0, RW_X_LATCH, nullptr, BUF_GET, mtr, err); } @@ -435,6 +440,8 @@ static dberr_t trx_undo_lists_init(trx_rseg_t *rseg, trx_id_t &max_trx_id, static dberr_t trx_rseg_mem_restore(trx_rseg_t *rseg, trx_id_t &max_trx_id, mtr_t *mtr) { + if (!rseg->space) + return DB_TABLESPACE_NOT_FOUND; dberr_t err; const buf_block_t *rseg_hdr= buf_page_get_gen(rseg->page_id(), 0, RW_S_LATCH, nullptr, BUF_GET, mtr, With that, the server starts up: 10.6 af552f2903b9764e3bc0634a0037c39d1a837f4f with patch 2022-08-18 15:58:01 0 [Note] InnoDB: !!!!!!!! UNIV_DEBUG switched on !!!!!!!!! 2022-08-18 15:58:01 0 [Note] InnoDB: Compressed tables use zlib 1.2.11 2022-08-18 15:58:01 0 [Note] InnoDB: Using transactional memory 2022-08-18 15:58:01 0 [Note] InnoDB: Number of pools: 1 2022-08-18 15:58:01 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions 2022-08-18 15:58:01 0 [Note] InnoDB: Using liburing 2022-08-18 15:58:01 0 [Note] InnoDB: Initializing buffer pool, total size = 3221225472, chunk size = 134217728 2022-08-18 15:58:01 0 [Note] InnoDB: Completed initialization of buffer pool 2022-08-18 15:58:01 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed 2022-08-18 15:58:01 0 [Note] InnoDB: Upgrading redo log: 4294967296 bytes; LSN=2757622 2022-08-18 15:58:01 0 [Note] InnoDB: Starting to delete and rewrite log file.

Tuukka Pasanen added a comment - 2022-08-19 07:53

This is fixed now: https://salsa.debian.org/illuusio/mariadb-server/-/jobs/3122466

Tuukka Pasanen added a comment - 2022-08-19 07:53 This is fixed now: https://salsa.debian.org/illuusio/mariadb-server/-/jobs/3122466

MariaDB Server

Percona XtraDB 5.7 can't be upgrade to MariaDB 10.6 or above

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration