[MDEV-31443] assert [FATAL] InnoDB: Unable to find charset-collation for 254 after upgrade from 10.11.4 to 11.0 Created: 2023-06-09  Updated: 2023-09-08  Resolved: 2023-07-05

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 11.0.2
Fix Version/s: 11.0.3, 11.1.2, 11.2.1

Type: Bug Priority: Critical
Reporter: Erlandas Assignee: Marko Mäkelä
Resolution: Fixed Votes: 1
Labels: assertion, crash
Environment:

Ubuntu 22.04


Attachments: File data.tar.xz     File docker-mariadb@debug.service.log.gz     File gdb_output.log     File t_table.sql    
Issue Links:
Blocks
is blocked by MDEV-32044 Mariadb crash after upgrading to 11.0... Closed
Duplicate
duplicates MDEV-31666 Assertion `recv_sys.recovery_on' fail... Closed
is duplicated by MDEV-31579 Mariadb fails to start following auto... Closed
Problem/Incident
is caused by MDEV-29694 Remove the InnoDB change buffer Closed

 Description   

Once the I have upgraded from 10.11.4 version - MariaDB did not start.

Crash dump is not showing symbols (dbgsym package is installed):

Reading symbols from /usr/sbin/mariadbd...
(No debugging symbols found in /usr/sbin/mariadbd)
(gdb)

Error log is showing:

2023-06-09 9:23:25 0 [Note] InnoDB: Number of transaction pools: 1
2023-06-09 9:23:25 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2023-06-09 9:23:25 0 [Note] InnoDB: Initializing buffer pool, total size = 4.000GiB, chunk size = 64.000MiB
2023-06-09 9:23:25 0 [Note] InnoDB: Completed initialization of buffer pool
2023-06-09 9:23:25 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
2023-06-09 9:23:25 0 [Note] InnoDB: Upgrading the change buffer
2023-06-09 9:23:25 0 [ERROR] [FATAL] InnoDB: Unable to find charset-collation for 254
230609 9:23:25 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 11.0.2-MariaDB-1:11.0.2+maria~ubu2204 source revision: 0005f2f06c8e1aea4915887decad67885108a929
key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=0
max_threads=65537
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 25202972 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x49000
/usr/sbin/mariadbd(my_print_stacktrace+0x32)[0x55712e8b1a62]
/usr/sbin/mariadbd(handle_fatal_signal+0x488)[0x55712e384e28]
libc_sigaction.c:0(__restore_rt)[0x7feb2b458520]
nptl/pthread_kill.c:44(__pthread_kill_implementation)[0x7feb2b4aca7c]
posix/raise.c:27(__GI_raise)[0x7feb2b458476]
stdlib/abort.c:81(__GI_abort)[0x7feb2b43e7f3]
/usr/sbin/mariadbd(+0x69753b)[0x55712df9953b]
/usr/sbin/mariadbd(+0x6812f7)[0x55712df832f7]
/usr/sbin/mariadbd(+0xdf703f)[0x55712e6f903f]
/usr/sbin/mariadbd(+0xdd3c92)[0x55712e6d5c92]
/usr/sbin/mariadbd(+0x6b2ebc)[0x55712dfb4ebc]
/usr/sbin/mariadbd(+0x691449)[0x55712df93449]
/usr/sbin/mariadbd(+0xd7aad9)[0x55712e67cad9]
/usr/sbin/mariadbd(_Z24ha_initialize_handlertonP13st_plugin_int+0x86)[0x55712e3880b6]
/usr/sbin/mariadbd(+0x83d686)[0x55712e13f686]
/usr/sbin/mariadbd(_Z11plugin_initPiPPci+0x91d)[0x55712e14084d]
/usr/sbin/mariadbd(+0x70bb91)[0x55712e00db91]
/usr/sbin/mariadbd(_Z11mysqld_mainiPPc+0x424)[0x55712e013324]
nptl/libc_start_call_main.h:58(__libc_start_call_main)[0x7feb2b43fd90]
csu/libc-start.c:128(call_init)[0x7feb2b43fe40]
/usr/sbin/mariadbd(_start+0x25)[0x55712e007b05]
The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
information that should help you find out what is causing the crash.
Writing a core file...
Working directory at /var/lib/mysql
Resource Limits:
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 256359 256359 processes
Max open files 65535 65535 files
Max locked memory unlimited unlimited bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 256359 256359 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
Core pattern: |/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g – %E

Kernel version: Linux version 5.15.0-73-generic (buildd@bos03-amd64-060) (gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38 ) #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023



 Comments   
Comment by Erlandas [ 2023-06-09 ]

I was unable to start MariaDB on multiple attempts and had to remove 11.0.2 version and install 10.11.4

Comment by Daniel Black [ 2023-06-13 ]
dump function notes
/usr/sbin/mariadbd(+0x69753b)[0x55712df9953b] ib::fatal::~fatal() [clone .cold]  
/usr/sbin/mariadbd(+0x6812f7)[0x55712df832f7] cmp_data(unsigned long, unsigned long, bool, unsigned char const*, unsigned long, unsigned char const*, unsigned long) missing charset is second argument here, unmodified by function
/usr/sbin/mariadbd(+0xdf703f)[0x55712e6f903f] cmp_dtuple_rec_with_match_low(dtuple_t const*, unsigned char const*, dict_index_t const*, unsigned short const*, unsigned long, unsigned long*)
/usr/sbin/mariadbd(+0xdd3c92)[0x55712e6d5c92] page_cur_search_with_match(dtuple_t const*, page_cur_mode_t, unsigned long*, unsigned long*, page_cur_t*, rtr_info*)  
/usr/sbin/mariadbd(+0x6b2ebc)[0x55712dfb4ebc] ibuf_upgrade()  
/usr/sbin/mariadbd(+0x691449)[0x55712df93449] srv_start(bool) [clone .cold]  
/usr/sbin/mariadbd(+0xd7aad9)[0x55712e67cad9] innodb_init(void*)  
Comment by Marko Mäkelä [ 2023-06-26 ]

The error message mentions prtype=254. I think that we may be missing some adjustment for data that had been originally written with something older than MySQL 4.1.2, before proper support for character sets was implemented. Originally, the prtype (precise type) did not include any charset-collation code. I think that we need some adjustment like this code in dict_load_column_low():

	if (dtype_get_charset_coll(prtype) == 0
	    && dtype_is_string_type(mtype)) {
		/* The table was created with < 4.1.2. */
 
		if (dtype_is_binary_string_type(mtype, prtype)) {
			/* Use the binary collation for
			string columns of binary type. */
 
			prtype = dtype_form_prtype(
				prtype,
				DATA_MYSQL_BINARY_CHARSET_COLL);
		} else {
			/* Use the default charset for
			other than binary columns. */
 
			prtype = dtype_form_prtype(
				prtype,
				default_charset_info->number);
		}
	}

To work around this problem, Erland22 could shut down 10.11 after executing the following:

SET GLOBAL innodb_fast_shutdown=0;

That should complete the change buffer merge, and then ibuf_upgrade() in 11.0 would have nothing to do. But, that would also destroy any evidence. We’d need a copy of an old enough data directory in order to reproduce and fix this bug.

Comment by Joshua Lückers [ 2023-06-26 ]

I'm experiencing the same issue on my work machine. I would love to provide a data directory but I don't feel comfortable sharing all the databases I have (I have a lot of them locally). Is there a way to track down what database/data directory is causing this issue?

P.s: I downgraded to an older version so everything works again. Is there e.g a query I can use to find the culprit without needed to "break" anything again?

Comment by Marko Mäkelä [ 2023-06-26 ]

JoshuaL, in a debug build of the server you can set a startup option innodb_change_buffer_dump=ON to have the contents of the change buffer dumped to the server error log. That should allow some tablespaces with the problematic metadata to be identified.

I do not know if there are ready-made packages for debug builds. (The -debuginfo or -dbgsym packages typically contain debugging information for release builds.)

I have asked our testers if it would be possible to load some data into MySQL 4.0 so that I can reproduce the failure and test a fix.

Comment by Christian Hesse [ 2023-06-26 ]

I am having the same issue, with MariaDB packages from Arch Linux testing repository, mariadb 11.0.2-1. For me it is another prtype, though:

2023-06-25 8:02:07 0 [ERROR] [FATAL] InnoDB: Unable to find charset-collation for 271

The data originates from a very old MySQL-Installation, but I did dump in MySQL 5.1.73 and imported to MariaDB 10.1.36 in 2018.

Comment by Erlandas [ 2023-06-26 ]

Thank you Marko,

I would prefer to pull out the problem instead of destroying it.

Here is my config (see the end of it where I have added the innodb_change_buffer_dump=ON

Should MariaDB 10.11.4 restart now make a dump of the problem into log file? Please confirm and I'll make a restart.

Just a note: _ innodb_fast_shutdown_ setting is being used since 10.6 version or so, before that (and before switching to MariaDB 10.1 from MySQL 5.6 I believe - such switch was not being used).

# this is only for the mysqld standalone daemon
[mysqld]
 
#
# * Basic Settings
#
 
#user                    = mysql
#pid-file                = /run/mysqld/mysqld.pid
#basedir                 = /usr
#datadir                 = /var/lib/mysql
#tmpdir                  = /tmp
 
 
 
 
user                    = mysql
pid-file                = /run/mysqld/mysqld.pid
basedir                 = /usr
datadir                 = /var/lib/mysql
 
 
lc-messages-dir         = /usr/share/mysql
lc-messages             = en_US
skip-external-locking
 
#Replication Settings
log-bin
log_bin_compress = 1
slave_compressed_protocol = 1
server_id=22
#Temporarily as read only while data served from old server
##read_only=1
# expire_logs_days is deprecated. Instead, use binlog_expire_logs_seconds
binlog_expire_logs_seconds=43200
gtid_strict_mode=1
large-pages
 
gtid_domain_id=22
log-basename=master22
binlog_format=mixed
 
replicate_do_db=cms
 
performance-schema=0
sql_mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"
open_files_limit = 240000
log_error=/var/lib/mysql/errors.log
log_warnings = 2
log_slow_slave_statements = 1
tmpdir = /var/lib/mysql/tmp
default-storage-engine=InnoDB
innodb_file_per_table=1
innodb_defragment=1
innodb_flush_method=O_DIRECT
innodb_flush_neighbors=0
innodb_max_dirty_pages_pct=85
innodb_buffer_pool_size=72G
innodb_change_buffering=all
innodb_log_file_size=16G
innodb_log_buffer_size =256M
innodb_io_capacity = 80000
innodb_read_io_threads = 32
innodb_write_io_threads = 32
max_connections=2000
key_buffer_size = 10M
max_allowed_packet=512M
 
wait_timeout=120
interactive_timeout=120
innodb_lock_wait_timeout=60
lock_wait_timeout = 50
read_buffer_size = 128K
table_definition_cache = 6400
table_open_cache = 12000
max_heap_table_size = 128M
tmp_table_size = 128M
join_buffer_size = 256K
sort_buffer_size = 256K
thread_handling=pool-of-threads
 
skip-name-resolve=1
query_cache_type = 0
query_cache_size = 0
innodb_buffer_pool_dump_pct=75
innodb_buffer_pool_dump_at_shutdown=ON
innodb_buffer_pool_load_at_startup=ON
innodb_flush_log_at_trx_commit =1
sync_binlog=1
innodb_change_buffer_dump=ON
 
[mysqldump]
max_allowed_packet=512M

Comment by Marko Mäkelä [ 2023-06-26 ]

Erland22, before you restart with innodb_change_buffer_dump=ON, please make sure that innodb_fast_shutdown=1 (not 0) is in effect. If the server fails to start up due to an unknown option innodb_change_buffer_dump, then it is not a debug version.

eworm, if you loaded an SQL dump into a freshly created MariaDB 10.1.36 database, that would simplify our testing. Do you happen to have a copy of the CREATE TABLE statements that you could share?

Comment by Christian Hesse [ 2023-06-26 ]

This is a large database with lots of dbs and about 4000 tables... Will have to identify the effected table first.

Comment by Marko Mäkelä [ 2023-06-26 ]

The .frm files should embed a MySQL or MariaDB version number of the server where the table was last created or rebuilt (not counting rebuild by TRUNCATE TABLE):

for i in */*.frm
do
od -Ax --endian=little -t u4 -j 51 -N 4 "$i"
done

The version number is encoded like 100908 for 10.9.8.

Comment by Christian Hesse [ 2023-06-26 ]

I had one table from 10.1.36 and one from 10.4.8... Rebuilt both with ALTER TABLE ... FORCE; - will check later if the situation changed.
All other tables are from 10.5.6 or later.

Comment by Christian Hesse [ 2023-06-26 ]

No change, still fails with the same message.

Comment by Marko Mäkelä [ 2023-06-26 ]

It is possible that the change buffer contains entries for tables that have been dropped a long time ago but were not removed from the change buffer. The change buffer makes use of lazy deletion.

I hope that someone can dump their change buffer. Here you might find a usable 10.6 debug build: https://ci.mariadb.org/36101/amd64-ubuntu-2004-debug/

Comment by Erlandas [ 2023-06-26 ]

I have the following:

MariaDB [(none)]> SHOW VARIABLES LIKE 'innodb_fast_shutdown';
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| innodb_fast_shutdown | 1     |
+----------------------+-------+
1 row in set (0.001 sec)

How do I check if I have debug version? I have installed Debug symbols package, but they are not showing up (as if that package is not installed).

Or what could I install on Ubuntu 22.04 in order to get change buffer pulled?

Comment by Erlandas [ 2023-06-26 ]

I have CREATE TABLES statements pulled how or where I could safely (not publicly) share it with you?

Comment by Christian Hesse [ 2023-06-26 ]

Running Arch here... So the Ubuntu build is of little use for me.
What options to add for a debug build with the desired capabilities? The change of build type is sufficient? Well, the package does not build with that option... :-/

Also have wanted to go with the slow shutdown here... But looks like it hangs, repeatedly (for an hour now) giving:
2023-06-26 15:55:21 0 [Note] Completing change buffer merge; 61 page reads initiated; 296 change buffer pages remain
So am I suffering any kind of change buffer corruption?

Comment by Marko Mäkelä [ 2023-06-27 ]

eworm, I use cmake -DCMAKE_BUILD_TYPE=Debug (and a few other options). Depending on the compiler and options, the GCC flag -Werror that this would imply could cause the build to fail. Does the build fail also when you use cmake -DMYSQL_MAINTAINER_MODE=WARN to disable the -Werror.

Yes, there should be progress messages output every 15 seconds if the upgrade of the change buffer is taking a long time, and the number of change buffer pages to process should steadily decrease. Can you please attach a debugger to the hung server process and produce full stack traces of the hang?

MDEV-29694 describes how we tested the upgrade logic. With MDEV-30009 fresh in my mind, I tried to pay special attention to scenarios where the change buffer bitmaps are corrupted. It looks like I have missed something. Hopefully the stack traces and a dump of the change buffer contents will make it possible to reproduce and fix this. After all, the database was originally initialized using MariaDB Server 10.1.36, which is not that old.

Comment by Christian Hesse [ 2023-06-27 ]

Ok, here we go... Built mariadb 10.9.7 with cmake -DCMAKE_BUILD_TYPE=Debug -DMYSQL_MAINTAINER_MODE=WARN and dumped the change buffer. Please see uploaded file docker-mariadb@debug.service.log.gz...

Regarding the hung server process on change buffer apply... Should I open a new issue for that? I guess it is not strictly related.

Comment by Marko Mäkelä [ 2023-06-27 ]

eworm, thank you. I think that also the hang is related to the upgrade logic that was implemented in MDEV-29694.

The type information will be encoded in the 4th field of each record, like this:

 3: len 16; hex 00000001860300040000860300048000; asc                 ;;

Because the length (16) is not divisible by 6, this is not in the innodb_change_buffering=inserts format, and the first 4 bytes are a 16-bit counter and additional metadata. (16-4=12)/6=2, meaning that there is metadata for 2 columns:

860300040000 860300048000

These 6 bytes are decoded in the function dtype_new_read_for_order_and_null_size().

The character-set/collation code would be the 15 least significant bits of each field (the last 4 hexadecimal digits masked with 0x7fff). The most significant bit of those 16 bits is the NOT NULL flag. Here we have charset-collation 0 for both fields, and the second field is NOT NULL.

For other than CHAR, VARCHAR, TEXT columns, the charset-collation code should be ignored.

The prtype without the flags DATA_BINARY_TYPE (1024) and DATA_NOT_NULL (256) or the charset-collation code is in the second byte. Here it would be 3 for both fields.

If you could produce a full stack trace for the FATAL error, that would point me directly to the relevant change buffer record.

Comment by Christian Hesse [ 2023-06-27 ]

Uploaded gdb output... Hope that helps.

Comment by Marko Mäkelä [ 2023-06-29 ]

eworm, thank you, gdb_output.log is very helpful. Unfortunately, the values of many variables have been optimized out. Can you execute the following in GDB?

frame 5
p/x block.page.id_
p/x index.table.space.id
p *index.fields@index.n_fields
p index.table.space.chain.start.name

The first command should switch to the stack frame of ibuf_insert_to_index_page(). The 32 most significant bits of the page identifier are the tablespace identifier, and the least significant 32 bits are the page number. The last command should identify the file name of the table. I would like to see the table definition. You can obfuscate the name of the table and the columns to protect any confidential data. What I am mostly interested in are the PRIMARY KEY columns and any indexed columns.

Comment by Christian Hesse [ 2023-06-29 ]

And here we go... Out put is:

(gdb) p/x block.page.id_
$1 = {m_id = 0x0}
(gdb) p/x index.table.space.id
$2 = 0x7f1
(gdb) p *index.fields@index.n_fields
$3 = {{col = 0x7f5f747cbf80, name = {m_name = 0x0}, prefix_len = 7, fixed_len = 0, descending = 0}, {col = 0x7f5f747cbfa0, name = {m_name = 0x0}, prefix_len = 4, fixed_len = 4, descending = 0}}
(gdb) p index.table.space.chain.start.name
$4 = 0x7f5f4aa96798 "./xxx/t_table.ibd"

I've uploaded the table structure in t_table.sql, where all names were replaced. The table was created with MariaDB 10.5.11, checked with the method described above.

Probably worth noting: The table has columns that contain German Umlaut characters ('ä', 'ö', 'ü'). Can that cause the issues we see here?

Comment by Marko Mäkelä [ 2023-06-29 ]

Thank you. Somehow I got the block wrong, but it does not matter.

The index here points to a dummy object that has been constructed based on the contents of change buffer records, which I guess explains why the prefix_len is nonzero even though there are no column prefix indexes in your table. The second field is the PRIMARY KEY(column01), a 4-byte integer. The first field should be something with 7 bytes, but I can't figure out what it would be. Can you also dump the following:

print *(dict_col_t*)0x7f5f747cbf80
print *(dict_col_t*)0x7f5f747cbfa0

If you are starting a new process and not debugging a previous core dump, the addresses can change. In that case:

frame 5
p *index.fields[0].col
p *index.fields[1].col

The encoding of dictionary object names was changed to UTF-8 basic multilingual plane (utf8mb3) back in MySQL 5.1. That should work just fine. Sometimes users forget to declare that the connection character set encoding is UTF-8 instead of the default latin1, while passing UTF-8 data. In that case, the data dictionary would see some "double encoded" names, as in the following (assuming that you are using UTF-8 in the shell):

echo äöü|recode latin1..utf8
äöü

Each of these 6 non-ASCII characters would then be encoded in the filename-safe encoding (@ followed by 2 non-hex chars, or 4 hexadecimal digits, that is, 3 or 5 ASCII chars per source character).

Comment by Christian Hesse [ 2023-06-29 ]

(gdb) print *(dict_col_t*)0x7f5f747cbf80
$8 = {prtype = 271, mtype = 12, len = 60, mbminlen = 0, mbmaxlen = 0, ind = 0, ord_part = 0, max_prefix = 0, static DROPPED = 1023, def_val = {data = 0x0, len = 4294967294}}
(gdb) print *(dict_col_t*)0x7f5f747cbfa0
$9 = {prtype = 1283, mtype = 6, len = 4, mbminlen = 0, mbmaxlen = 0, ind = 1, ord_part = 0, max_prefix = 0, static DROPPED = 1023, def_val = {data = 0x0, len = 4294967294}}

No double encoding I have characters encoded in UTF-8 as expected. That should be fine then.

Comment by Marko Mäkelä [ 2023-06-29 ]

Thank you. I think that this should be the minimal table definition to reproduce this:

CREATE TABLE `t_table` (
  `column01` int(11) NOT NULL AUTO_INCREMENT,
  `column06` varchar(20) NOT NULL DEFAULT '',
  PRIMARY KEY (`column01`),
  KEY `column06` (`column06`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci;

Comment by Marko Mäkelä [ 2023-06-29 ]

Some sanity checking: mtype=12=DATA_VARMYSQL and prtype=271=256+15=DATA_NOT_NULL+DATA_MYSQL_TRUE_VARCHAR do match column06, which is VARCHAR in the post-MySQL-5.0.3 .frm file format and NOT NULL. The len=60 in bytes matches VARCHAR(20) with up to 3 bytes per character.

For some reason, the charset-collation part of prtype had not been set. Perhaps in some older minor versions of MariaDB there was some bug related to that. I would expect our stress test for the upgrade to have generated that kind of data.

eworm, do you happen to have complete server error logs so that you could list all MariaDB versions that have ever touched this data directory? It could help the search.

Also, in case this should depend on the data, can you try to determine the page number? Here is an alternative way of extracting it as well as the change buffer record:

p/x *(char*)0x7f5f39764004@4
frame 5
print *entry
print *entry.fields@entry.n_fields
p/x *(char*)entry.fields[0].data@entry.fields[0].len
p/x *(char*)entry.fields[1].data@entry.fields[1].len

That would help me to find the related the change buffer record. In [^docker-mariadb@debug.service.log.gz] there are many records for many pages of this tablespace 0x7f1. Maybe they all share the same metadata, but I did not try to check that yet.

Comment by Christian Hesse [ 2023-06-29 ]

Upgrade path is:

  • 10.1.36
  • 10.1.37
  • 10.3.12
  • 10.3.13
  • 10.3.14
  • 10.3.15
  • 10.3.16
  • 10.4.8
  • 10.4.11
  • 10.4.12
  • 10.4.13
  • 10.4.14
  • 10.5.5
  • 10.5.6
  • (switched wsrep_sst_method from rsync to mariabackup)
  • 10.5.9
  • 10.5.10
  • 10.5.11 (last ALTER TABLE or similar happened here)
  • 10.5.13
  • 10.9.3
  • 10.9.5
  • 10.9.7
  • 11.0.2

This is not from error log though, but from package manager log and git repository where I store the database structure.

All packages are from official Arch Linux repositories, unless we could not update due to temporary incompatibilities with our applications. I think we had some bugfix releases in production where Arch Linux was ahead with next feature release.

BTW, this is a Galera cluster... Does not matter, no?

Comment by Christian Hesse [ 2023-06-29 ]

Some of your gdb commands fail...

(gdb) p/x *(char*)0x7f5f39764004@4
$12 = {0x0, 0x0, 0x0, 0x0}
(gdb) frame 5
#5  0x0000561eefba245d in ibuf_insert_to_index_page (mtr=0x7ffd75e92d48, index=0x7f5f749f7f80, block=0x7f5f39401440, entry=0x7f5f749f7080) at /usr/src/debug/mariadb/mariadb-11.0.2/storage/innobase/ibuf/ibuf0ibuf.cc:384
384             if (page_cur_search_with_match(entry, PAGE_CUR_LE,
(gdb) print *entry
$13 = {info_bits = 0, n_fields = 2, n_fields_cmp = 2, fields = 0x7f5f749f70b0, n_v_fields = 0, v_fields = 0x0}
(gdb) print *entry.fields@entry.n_fields
A syntax error in expression, near `@entry.n_fields'.
(gdb) p/x *(char*)entry.fields[0].data@entry.fields[0].len
A syntax error in expression, near `@entry.fields[0].len'.
(gdb) p/x *(char*)entry.fields[1].data@entry.fields[1].len
A syntax error in expression, near `@entry.fields[1].len'.

Comment by Christian Hesse [ 2023-06-29 ]

Oh, some extra questions... Would an ALTER TABLES t_table FORCE; fix this?

And any idea if this can cause the hang on slow shutdown?

Comment by Marko Mäkelä [ 2023-06-30 ]

eworm, it is very strange that you are seeing a page number 0 also in the page frame and not only in the block descriptor. The smallest page number in an .ibd that can contain a secondary index page is 4. I think that we may have to debug this deeper, so please preserve a copy of this corrupted data for further analysis.

What was the last ALTER TABLE statement? Did it rebuild the table? Which server version number do you see in the .frm file?

od -Ax --endian=little -t u4 -j 51 -N 4 t_table.frm

That version number would change on any ALTER TABLE, including some that do not affect InnoDB at all, say, changing the default value of a column.

The GDB @ operator (for interpreting a pointer as an array) should be more than 20 years old. Would the following work?

print *entry.fields@2

or "manually"

print entry.fields[0]
print entry.fields[1]

Because the data lengths should be 7 and 4, I guess that the following should work:

p/x *(char*)entry.fields[0].data@7
p/x *(char*)entry.fields[1].data@4

Alternatively, if you have the data pointer address, you could do

x/7x 0x…
x/4x 0x…

Yes, rebuilding the table should fix this, as should a slow shutdown of the 10.x server before upgrading. Dropping and re-adding the secondary indexes may cause trouble if you are using an older server version where MDEV-30009 had not been fixed.

Thank you for mentioning Galera cluster. If you used snapshot transfer (SST), the data may have been corrupted by that at some time of the past. Before MDEV-24845 was fixed, the mechanism to stop InnoDB writes on the donor node of wsrep_sst_method=rsync failed to block writes when using the default innodb_use_native_aio=1. Unfortunately, there also have been rather hard to repeat bugs in InnoDB crash recovery and mariadb-backup, which could cause data corruption, also when using wsrep_sst_method=mariabackup. Our internal stress testing is mostly conducted on 10.6 and later, and any bugs found by MDEV-18976 and other means will also be applied to 10.5. In 10.4 I know that there are some bugs in the log-based recovery, but there is no reproducible test case, and the problems may be unfixable in the old log format, which MDEV-12353 and MDEV-14425 replaced in 10.5 and 10.8.

Basically, each time you perform Galera SST with other than wsrep_sst_method=mysqldump, you will copy any corruption from the donor node and potentially add more corruption.

Comment by Marko Mäkelä [ 2023-06-30 ]

In [^docker-mariadb@debug.service.log.gz] there are several records that look like this:

PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 0
 0: len 4; hex 000007f1; asc     ;;
 1: len 1; hex 00; asc  ;;
 2: len 4; hex 000000ff; asc     ;;
 3: len 16; hex 000900010c0f003c8021860300048000; asc        < !      ;;
 4: len 7;
 5: len 4;

Apart for page 0xff (255), there are some for pages 0x9f7e, 0xd2ea, 0x15175, and so on. All the secondary index keys that I paid attention to looked like K followed by 6 decimal digits, so I would assume that they are for the same index.

I think that it would be good to follow the page access patterns, with some breakpoints. The change buffer merge applies to secondary index leaf pages, and it never causes page splits or merges. I would like to have copies of the data pages that are being accessed right before the crash.

I suppose it will crash in the same way each time. First, I would suggest the following commands to find out which pages were accessed right before the crash:

set print frame-arguments all
break buf_page_get_gen
cond 1 page_id.m_id>>32 == 0x7f1
command 1
continue
end
continue

The page_id.m_id that GDB should display as a function parameter should be 0x7f1<<32 + page number, that is, 8731668512768 + page number.

In some builds, the function parameters may be garbage until after the function preamble has been executed. Should that be the case, you would have to set the breakpoint on a specific source code line inside the function, say, buf0buf.cc:2649 in mariadb-11.0.2.

For sanity checking, I would like to see the last 3 page numbers that were accessed before the crash. I would also like to see a page dump of the last page before there is any attempt to apply any changes to it. That you can send to me by email, to protect your data.

A page dump can be obtained from the buffer pool like this, if a breakpoint in buf_page_get_gen has been hit:

finish
dump binary memory page.bin $1.page.frame $1.page.frame+srv_page_size

You will have to replace $1 with whatever GDB reports as the return value of the function.

Comment by Christian Hesse [ 2023-07-03 ]

We switched the wsrep_sst_method from rsync to mariabackup half way the cycle with MariaDB 10.5.6.

Comment by Christian Hesse [ 2023-07-03 ]

The last ALTER TABLE to that table (at 10.5.11) change column25 from varchar(20) to varchar(50).

Comment by Marko Mäkelä [ 2023-07-03 ]

Thank you. Extending a VARCHAR (MDEV-16849) does not in this case force a table rebuild.

The corruption could very well be blamed to wsrep_sst_method=rsync malfunctioning in the past.

I would still like to make sure that the upgrade procedure tolerates some corruption like this. If all else fails, we could write a log message, saying that there is some corruption, and you should fix that by OPTIMIZE TABLE or ALTER TABLE…FORCE. This should always work, because the change buffer only covers secondary indexes, not the clustered index, where the data is stored.

Comment by Chris Page [ 2023-07-03 ]

I have run into the same issue and can confirm that running `SET GLOBAL innodb_fast_shutdown=0;` prior to stopping my MariaDB 10 instance allows me to successfully install MariaDB 11.

However, can I confirm this is a viable workaround to the issue, i.e. it won't cause any future consistency issues?

Thanks,
Chris.

Comment by Marko Mäkelä [ 2023-07-03 ]

chrispage1, yes, a prior slow shutdown should empty the change buffer. Your experience suggests that no corruption might not be involved at all, but just some incorrect logic in the upgrade procedure. Originally, I was considering to simply refuse upgrade if the change buffer is not empty. I thought that it would be more user-friendly to implement an upgrade procedure. I hope that I will soon get more details from someone, so that the problem can be fixed.

Comment by Chris Page [ 2023-07-03 ]

What else do you need for testing? This test was on a snapshot and I have two other servers to upgrade.

Comment by Marko Mäkelä [ 2023-07-03 ]

chrispage1, I would need an interactive debugging session with access to the data, so that I can check some data pages that were accessed in an attempted upgrade that leads to this failure.

Comment by Chris Page [ 2023-07-03 ]

I could provide you with SSH access over a screen session? It's the least I could do to support MariaDB. Unfortunately I won't be able to send my data off as there is some pretty sensitive stuff in there.

Comment by Marko Mäkelä [ 2023-07-04 ]

I had an interactive debugging session with eworm today. I did not see any obvious sign of data corruption in that. The fatal error would occur on the very first record that is being inserted into page 255:

PHYSICAL RECORD: n_fields 6; 1-byte offsets; info bits 0
 0: len 4; hex 000007f1; asc     ;;
 1: len 1; hex 00; asc  ;;
 2: len 4; hex 000000ff; asc     ;;
 3: len 16; hex 000000010c0f003c8021860300048000; asc        < !      ;;
 4: len 7; hex 4b313030323438; asc K100248;;
 5: len 4; hex 80b3dbe1; asc     ;;

That page only contains the key K100248 with various PRIMARY KEY column values. One of the inserts into the page should probably invoke page reorganization (to empty the PAGE_FREE list and to reduce the PAGE_GARBAGE to 0), but the code crashed already on the INSERT. Only the PAGE_MAX_TRX_ID had been updated from the change buffer leaf page to the secondary index leaf page.

In other news, mleich informed me that he finally reproduced this bug. I should get access to his data shortly, so that I can debug and fix this more conveniently, having access to the full data.

Comment by Matthias Leich [ 2023-07-04 ]

Upgrade from
    mariadb-10.11.4, origin/bb-10.11-release 4e2b93dffef2414a11ca5edc8d215f57ee5010e5 2023-06-05T19:04:58+02:00
to
   mariadb-11.0.2, origin/bb-11.0-release 0005f2f06c8e1aea4915887decad67885108a929 2023-06-05T19:27:00+02:00
 
The restart with mariadb-11.0.2 fails with  [ERROR] [FATAL] InnoDB: Unable to find charset-collation for 254.
pluto:/data/results/1688413161/TBR-1986-MDEV-31443$ _RR_TRACE_DIR=./1/rr/ rr replay --mark-stdio
 
RQG
====
# git clone https://github.com/mleich1/rqg --branch <pick the right branch> RQG
#
# GIT_SHOW: HEAD -> master, origin/master, origin/HEAD 0fb21c5d3820319b143188a54734f9dbae946b23 2023-07-03T19:30:45+02:00
# rqg.pl  : Version 4.2.1 (2022-12)
#
# $RQG_HOME/rqg.pl \
# --grammar=conf/engines/many_indexes.yy \
# --gendata=conf/engines/many_indexes.zz \
# --max_gd_duration=900 \
# --mysqld=--transaction-isolation=REPEATABLE-READ \
# --validator=SelectStability \
# --upgrade-test \
# --mysqld=--loose-innodb_lock_schedule_algorithm=fcfs \
# --mysqld=--loose-idle_write_transaction_timeout=0 \
# --mysqld=--loose-idle_transaction_timeout=0 \
# --mysqld=--loose-idle_readonly_transaction_timeout=0 \
# --mysqld=--connect_timeout=60 \
# --mysqld=--interactive_timeout=28800 \
# --mysqld=--slave_net_timeout=60 \
# --mysqld=--net_read_timeout=30 \
# --mysqld=--net_write_timeout=60 \
# --mysqld=--loose-table_lock_wait_timeout=50 \
# --mysqld=--wait_timeout=28800 \
# --no_mask \
# --queries=10000000 \
# --seed=random \
# --reporters=None \
# --reporters=ErrorLog \
# --reporters=Deadlock \
# --reporters=Upgrade \
# --validators=None \
# --mysqld=--log_output=none \
# --mysqld=--log_bin_trust_function_creators=1 \
# --mysqld=--loose-debug_assert_on_not_freed_memory=0 \
# --engine=InnoDB \
# --restart_timeout=240 \
# --mysqld=--plugin-load-add=file_key_management.so \
# --mysqld=--loose-file-key-management-filename=$RQG_HOME/conf/mariadb/encryption_keys.txt \
# --mysqld=--plugin-load-add=provider_lzo.so \
# --mysqld=--plugin-load-add=provider_bzip2.so \
# --mysqld=--plugin-load-add=provider_lzma.so \
# --mysqld=--plugin-load-add=provider_snappy.so \
# --mysqld=--plugin-load-add=provider_lz4.so \
# --mysqld=--loose-innodb_compression_level=1 \
# --duration=300 \
# --mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
# --mysqld=--lock-wait-timeout=15 \
# --mysqld=--innodb-lock-wait-timeout=10 \
# --mysqld=--loose-innodb_read_only_compressed=OFF \
# --mysqld=--innodb_stats_persistent=on \
# --mysqld=--innodb_adaptive_hash_index=on \
# --mysqld=--innodb_random_read_ahead=OFF \
# --mysqld=--log-bin \
# --mysqld=--sync-binlog=1 \
# --mysqld=--loose-innodb_evict_tables_on_commit_debug=on \
# --mysqld=--loose-max-statement-time=30 \
# --threads=9 \
# --mysqld=--innodb-use-native-aio=0 \
# --mysqld=--loose-gdb \
# --mysqld=--loose-debug-gdb \
# --rr=Extended \
# --rr_options=--wait \
# --mysqld=--innodb_undo_log_truncate=OFF \
# --mysqld=--loose_innodb_change_buffering=changes \
# --mysqld=--innodb_rollback_on_timeout=ON \
# --mysqld=--innodb_page_size=4K \
# --mysqld=--innodb-buffer-pool-size=5M \
# <local settings>
 
Four of ~ 10000 RQG random tests, 124 with the grammar many_indexes.yy failed with that bad effect.
Hence some automatic test simplification could be applied.

Comment by Marko Mäkelä [ 2023-07-04 ]

For some reason, the upgrade testing for MDEV-29694 did not cover upgrading from a dataset where there were entries for indexes on string columns. I guess that whatever I used in sysbench was based on integer columns, or on string columns on the original default collation latin1_swedish_ci.

The rewritten function dtype_new_read_for_order_and_null_size() was actually losing the charset-collation information, by shifting a 32-bit variable twice by 16 bits. In fact, my compiler (clang-16 -O2) was smart enough to optimize away the entire line that the patch below is changing:

diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc
index 5303b592c71..f8043f6e05c 100644
--- a/storage/innobase/ibuf/ibuf0ibuf.cc
+++ b/storage/innobase/ibuf/ibuf0ibuf.cc
@@ -156,7 +156,7 @@ dtype_new_read_for_order_and_null_size(
 		<< 16;
 
 	if (dtype_is_string_type(type->mtype)) {
-		type->prtype |= charset_coll << 16;
+		type->prtype |= charset_coll;
 
 		if (charset_coll == 0) {
 			/* This insert buffer record was inserted before

There are also some other issues that mleich’s testing revealed. I pushed a tentative fix for more extensive internal testing. Ready-built packages of that (if anyone in the community want to test this) should be available in https://ci.mariadb.org/36376/ in some time, for some time.

There appears to be something wrong with eworm’s data set, because a slow shutdown of MariaDB Server 10.9.7 would hang in change buffer merge. That we will debug separately. It would be interesting to know if this fix would allow that dataset to be upgraded to 11.0.

Comment by Marko Mäkelä [ 2023-07-04 ]

I reproduced the issue with data.tar.xz (using innodb_page_size=4k). There is one more bug that my initial fix missed, resulting in a debug assertion failure and possibly broken crash recovery if the server is killed and restarted during the upgrade. The dataset for that appears to be too large to be included here.

The packages for this follow-up should appear in https://ci.mariadb.org/36377/.

Comment by Christian Hesse [ 2023-07-04 ]

Tested MariaDB 11.0.2 with 1b198aa2ff9a7f4d2a580c7915798056fdbdbe66 and cbdd6179cf870bc13b12b90fb365d7aca46cc55a on top. It does no longer crash, but bail out with an error:

2023-07-04 16:20:57 0 [Note] InnoDB: Upgrading the change buffer
2023-07-04 16:21:02 0 [ERROR] InnoDB: Unable to upgrade the change buffer
2023-07-04 16:21:02 0 [ERROR] InnoDB: Plugin initialization aborted with error Data structure corruption

Comment by Daniel Black [ 2023-07-05 ]

The container quay.io/mariadb-foundation/mariadb-devel:11.0-mdev-31443-pkgtest was also generated from the build and available for testing.

Comment by Marko Mäkelä [ 2023-07-05 ]

Today, I hope to figure out what is triggering the corruption report for eworm. I believe that it may be genuine corruption, because a slow shutdown of MariaDB 10.9.7 with a copy of that database would hang in change buffer merge, periodically spitting out "progress" messages from srv_shutdown_print(), with unchanged content. Possibly, I could come up with some logic that when innodb_force_recovery=1 is set, such corruption will be ignored and the upgrade would go through. The impact of that would be that some secondary indexes of some InnoDB tables could end up being corrupted.

Comment by Christian Hesse [ 2023-07-05 ]

Oh, by the way... Anybody interested in Arch Linux packages for testing?

Comment by Marko Mäkelä [ 2023-07-05 ]

The upgrade actually went through fine after applying one more fix:

diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc
index 04d45f2e9d0..a13af290d8d 100644
--- a/storage/innobase/ibuf/ibuf0ibuf.cc
+++ b/storage/innobase/ibuf/ibuf0ibuf.cc
@@ -709,7 +709,7 @@ static dberr_t ibuf_merge(fil_space_t *space, btr_cur_t *cur, mtr_t *mtr)
     rec_t *rec= cur->page_cur.rec;
     ulint n_fields= rec_get_n_fields_old(rec);
 
-    if (n_fields <= IBUF_REC_FIELD_USER + 1 || rec[4])
+    if (n_fields < IBUF_REC_FIELD_USER + 1 || rec[4])
       return DB_CORRUPTION;
 
     n_fields-= IBUF_REC_FIELD_USER;

This consistency check was too strict. In [^docker-mariadb@debug.service.log.gz] there are several buffered entries to a redundant secondary index, like this:

PHYSICAL RECORD: n_fields 5; 1-byte offsets; info bits 0
 0: len 4; hex 00007e9e; asc   ~ ;;
 1: len 1; hex 00; asc  ;;
 2: len 4; hex 0001d2fd; asc     ;;
 3: len 10; hex 00000001860300048000; asc           ;;
 4: len 4; hex 814609c0; asc  F  ;;

On the table, there is a PRIMARY KEY (actually a NOT NULL UNIQUE column) as well as a non-unique secondary index defined on the PRIMARY KEY column. We can apply these changes just fine.

We do not know yet what caused the slow shutdown of MariaDB Server 10.9.7 to hang on this data. I suggest the following steps:

  1. Determine the complete list of tablespace IDs:

    gzip -dc docker-mariadb\@debug.service.log.gz|sed -ne 's/^ 0: len 4; hex 0*\([0-9a-f]*\);.*/\1/p'|uniq
    

  2. Find out the corresponding table names (the list is truncated in the example below):

    SELECT name FROM information_schema.innodb_sys_tables WHERE space in(0x7b9,0x7f1,0x7ff,0x94b,0x953,0x972,0x97a,0x97c,0x97d);
    

  3. Execute CHECK TABLE…QUICK on all these tables to force a change buffer merge. Drop and re-add any reported corrupted indexes, or execute OPTIMIZE TABLE to rebuild such tables.
  4. Execute a normal (non-slow) shutdown.
  5. Dump the contents of the change buffer (using a debug build).
  6. Restart a non-debug server. If the change buffer was not empty, look up the tables corresponding to the tablespace IDs, and execute OPTIMIZE TABLE or ALTER TABLE…FORCE.
Comment by Marko Mäkelä [ 2023-07-05 ]

There are some readily built packages in https://ci.mariadb.org/36411/ for the final fix.

Comment by Christian Hesse [ 2023-07-05 ]

Pushed to Arch Linux testing repositories as mariadb 11.0.2-2...

Comment by Christian Hesse [ 2023-07-05 ]

We had another debug session for MariaDB 10.9.7 slow shutdown hang. We could not find the exact issue, but solved it anyway: With the extracted tablespace IDs I could get the table names from information_schema.innodb_sys_tables, then run OPTIMIZE TABLE on all of these. Finally the slow shutdown succeeded.

Comment by Artem Russakovskii [ 2023-08-02 ]

Is the fix above going to also work for MDEV-31579?

Comment by Artem Russakovskii [ 2023-08-14 ]

11.0.3 is out https://mariadb.com/kb/en/mariadb-11-0-3-release-notes/.

Comment by Artem Russakovskii [ 2023-08-30 ]

I can confirm upgrading to 11.0.3 fixed the issue for me.

Comment by Artem Russakovskii [ 2023-08-30 ]

Aaand a few minutes in, the server started crashing. Not sure if related to this bug or not, but here: MDEV-32044.

Generated at Thu Feb 08 10:23:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.