[MDEV-12023] Assertion failure sym_node->table != NULL on startup Created: 2017-02-08  Updated: 2020-08-25  Resolved: 2018-10-30

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Storage Engine - XtraDB
Affects Version/s: 10.0, 10.1, 10.2, 10.3
Fix Version/s: 10.0.37, 10.3.11, 10.1.37, 10.2.19

Type: Bug Priority: Major
Reporter: Andrii Nikitin (Inactive) Assignee: Marko Mäkelä
Resolution: Fixed Votes: 1
Labels: None

Attachments: File 1.sh.err    
Issue Links:
Relates
relates to MDEV-14585 Automatically remove #sql- tables in ... Closed
relates to MDEV-17296 MariaDB fails to start after update: ... Closed

 Description   

When I use .sh test from https://bugs.launchpad.net/percona-xtrabackup/+bug/1399471 and ignore_builtin_innodb
plugin_load_add=ha_innodb
in my.cnf - I get reliable crash :

InnoDB: Apply batch completed
2017-02-08 09:34:38 7f4f19145780  InnoDB: Assertion failure in thread 139977699907456 in file pars0pars.cc line 865
InnoDB: Failing assertion: sym_node->table != NULL
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
170208  9:34:38 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.1.21-MariaDB
key_buffer_size=0
read_buffer_size=131072
max_used_connections=0
max_threads=1
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5297 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x48400
/farm/m0-bb-10.1-wlad-xtrabackup/build/extra/xtrabackup//xtrabackup(my_print_stacktrace+0x29)[0x55b98f6df519]
/farm/m0-bb-10.1-wlad-xtrabackup/build/extra/xtrabackup//xtrabackup(handle_fatal_signal+0x2f5)[0x55b98f2c9395]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f4f18d23390]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7f4f171b7428]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f4f171b902a]
mysys/stacktrace.c:268(my_print_stacktrace)[0x55b98f612ca6]
pars/pars0pars.cc:1282(pars_update_statement(upd_node_t*, sym_node_t*, void*))[0x55b98f614536]
xtradb/pars0grm.y:443(yyparse())[0x55b98f6a9bdb]
pars/pars0pars.cc:2247(pars_sql(pars_info_t*, char const*))[0x55b98f615be0]
que/que0que.cc:1260(que_eval_sql(pars_info_t*, char const*, unsigned long, trx_t*))[0x55b98f619163]
row/row0mysql.cc:4957(row_drop_table_for_mysql(char const*, trx_t*, bool, unsigned long, bool))[0x55b98f63e9c9]
row/row0mysql.cc:5211(row_mysql_drop_temp_tables())[0x55b98f6402b5]
log/log0recv.cc:3611(recv_recovery_rollback_active())[0x55b98f5efb3e]
srv/srv0start.cc:2715(innobase_start_or_create_for_mysql())[0x55b98f66a37c]
xtrabackup/xtrabackup.cc:1847(innodb_init())[0x55b98f08263a]
xtrabackup/xtrabackup.cc:6369(xtrabackup_prepare_func)[0x55b98f08edce]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f4f171a2830]
/farm/m0-bb-10.1-wlad-xtrabackup/build/extra/xtrabackup//xtrabackup(_start+0x29)[0x55b98f0a2959]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
inc/common.sh: line 102:  2296 Aborted                 "$@"

I couldn't reproduce the same with manual testing nor with 'identical' mtr tests (which i run with "-mysqld=ignore_builtin_innodb --mysqld=-plugin_load_add=ha_innodb" ) like below:

START TRANSACTION;
CREATE TEMPORARY TABLE t(id INT) ENGINE=InnoDB;
let $targetdir=$MYSQLTEST_VARDIR/tmp/backup;
exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --backup --target-dir=$targetdir;
exec $XTRABACKUP --defaults-file=$MYSQLTEST_VARDIR/my.cnf --prepare --target-dir=$targetdir;

Thus I am still filing this bug to let developer review if XtraDB patches from https://bugs.launchpad.net/percona-xtrabackup/+bug/1399471 should be applied to ha_innodb as well.



 Comments   
Comment by Vladislav Vaintroub [ 2017-09-12 ]

anikitin "to let developer review if XtraDB patches from https://bugs.launchpad.net/percona-xtrabackup/+bug/1399471 should be applied to ha_innodb as well".

we will not apply xtrabackup patches to innodb_plugin, this won't happen. The xtrabackup recovery will always be done with single version of innodb (i.e mariadb's version xtradb in 10.1, and marias version of innodb in 10..2+)

marko, can you take a shot on what seems to be a recovery bug?

Comment by Andrii Nikitin (Inactive) [ 2017-09-12 ]

I can't reproduce it anymore in 10.1 , so reassigning to myself and probably will just close after reviewing once again.

Comment by Marko Mäkelä [ 2017-09-12 ]

wlad, for what it is worth, the InnoDB in MariaDB before 10.2.2 and MySQL before 5.7 treats temporary tables mostly like persistent tables.

This is a strange failure that could affect also the MariaDB server prior to 10.2. In row_mysql_drop_temp_tables() there is this check for table!=NULL:

		table = dict_table_get_low(table_name);
 
		if (table) {
			row_drop_table_for_mysql(table_name, trx, FALSE, FALSE);
			trx_commit_for_mysql(trx);
		}

It seems to me that the sym_node->table that is NULL in pars_update_statement() is referring to a data dictionary table, not to the user table. The statement in the InnoDB internal SQL in row_drop_table_for_mysql() should be one of SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_TABLESPACES, SYS_DATAFILES. The other dictionary tables (SYS_TABLES, SYS_COLUMNS, SYS_INDEXES, SYS_FIELDS) are hard-coded, so those tables can never be missing.

I do not think that there have been any fixes in this area. I wonder how this could happen. I have two guesses:

  1. the backup was initiated while the server was starting up, before the non-core data dictionary tables had been fully created.
  2. the backup was from MySQL 5.5 where the tables SYS_TABLESPACES, SYS_DATAFILES did not exist.

MDEV-11816 prevents this failure in MariaDB 10.2 by removing the function row_mysql_drop_temp_tables() altogether.
MDEV-12548 prevents this type of failures in Mariabackup 10.2 by avoiding any operations that would generate redo log (or undo log, for that matter).

Comment by Marko Mäkelä [ 2017-09-13 ]

I had a look at the Percona bug and patches today. Indeed, the reported problem occurred when preparing a backup of a 5.5 database, which would lack the SYS_TABLESPACES and SYS_DATAFILES tables. The fix would avoid the ‘dictionary table does not exist’ error by using a different InnoDB SQL concept. The fix is unnecessarily avoiding access to the hard-coded SYS_INDEXES table.

Unlike Mariabackup, Percona XtraBackup aims to support multiple database versions. Mariabackup only supports the major server version that it was packaged with. Starting with Mariabackup 10.2, that is actually enforced by the redo log format identifier.

Without this bug, Mariabackup 10.1 should work with 5.5 datafiles. It would seem to make some sense to port the fix to 10.1 (XtraDB and Mariabackup, not necessarily InnoDB).

Comment by Marko Mäkelä [ 2018-10-26 ]

A similar crash is possible in InnoDB startup at least since MariaDB Server 10.0. Here is an error log excerpt:

10.2

2018-10-16 13:43:27 140340678678656 [Note] InnoDB: Initializing buffer pool, total size = 11G, instances = 8, chunk size = 128M
2018-10-16 13:43:28 140340678678656 [Note] InnoDB: Completed initialization of buffer pool
2018-10-16 13:43:28 140327376807680 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2018-10-16 13:43:28 140340678678656 [Note] InnoDB: Highest supported file format is Barracuda.
2018-10-16 13:43:28 140340678678656 [Note] InnoDB: Starting crash recovery from checkpoint LSN=18353004516364
2018-10-16 13:43:28 140340678678656 [Note] InnoDB: Starting final batch to recover 2 pages from redo log.
2018-10-16 13:43:29 140340678678656 [Note] InnoDB: Last binlog file '/mariadb/bin/logs/bin_logs/bin_log.035074', position 100498
2018-10-16 13:43:29 140340678678656 [ERROR] InnoDB: Operating system error number 2 in a file operation.
2018-10-16 13:43:29 140340678678656 [ERROR] InnoDB: The error means the system cannot find the path specified.
2018-10-16 13:43:29 140340678678656 [ERROR] InnoDB: If you are installing InnoDB, remember that you must create directories yourself, InnoDB does not create them.
2018-10-16 13:43:29 140340678678656 [ERROR] InnoDB: Cannot open datafile for read-only: './crane_db/#sql-30e3_41490.ibd' OS error: 71
2018-10-16 13:43:29 140340678678656 [ERROR] InnoDB: Operating system error number 2 in a file operation.
2018-10-16 13:43:29 140340678678656 [ERROR] InnoDB: The error means the system cannot find the path specified.
2018-10-16 13:43:29 140340678678656 [ERROR] InnoDB: If you are installing InnoDB, remember that you must create directories yourself, InnoDB does not create them.
2018-10-16 13:43:29 140340678678656 [ERROR] InnoDB: Could not find a valid tablespace file for `crane_db/#sql-30e3_41490`. Please refer to http://dev.mysql.com/doc/refman/5.7/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2018-10-16 13:43:29 140340678678656 [Warning] InnoDB: Ignoring tablespace for `crane_db`.`#sql-30e3_41490` because it could not be opened.
2018-10-16 13:43:30 140340678678656 [ERROR] InnoDB: Operating system error number 2 in a file operation.
2018-10-16 13:43:30 140340678678656 [ERROR] InnoDB: The error means the system cannot find the path specified.
2018-10-16 13:43:30 140340678678656 [ERROR] InnoDB: If you are installing InnoDB, remember that you must create directories yourself, InnoDB does not create them.
2018-10-16 13:43:30 140340678678656 [ERROR] InnoDB: Cannot open datafile for read-only: './crane_db/#sql-30e3_41490.ibd' OS error: 71
2018-10-16 13:43:30 140340678678656 [ERROR] InnoDB: Operating system error number 2 in a file operation.
2018-10-16 13:43:30 140340678678656 [ERROR] InnoDB: The error means the system cannot find the path specified.
2018-10-16 13:43:30 140340678678656 [ERROR] InnoDB: If you are installing InnoDB, remember that you must create directories yourself, InnoDB does not create them.
2018-10-16 13:43:30 140340678678656 [ERROR] InnoDB: Could not find a valid tablespace file for `crane_db/#sql-30e3_41490`. Please refer to http://dev.mysql.com/doc/refman/5.7/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2018-10-16 13:43:30 0x7fa39c4cd880  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.14/storage/innobase/pars/pars0pars.cc line 817
InnoDB: Failing assertion: sym_node->table != NULL

Comment by Marko Mäkelä [ 2018-10-29 ]

Starting with MDEV-14585 (MariaDB 10.3, or MariaDB 10.2.19 with innodb_safe_truncate=ON), InnoDB startup attempts to drop tables whose name starts with #sql.
I would assume that when the #sql*.ibd file has been manually deleted by the user, but the table exists in the InnoDB SYS_TABLES, then InnoDB would crash on startup.

Comment by Thirunarayanan Balathandayuthapani [ 2018-10-29 ]

Test case in 10.1:

--source include/have_innodb.inc
let $MYSQLD_DATADIR= `select @@datadir`;
let datadir= `select @@datadir`;
 
create table `#mysql50##sql-ib-something` (a INT PRIMARY KEY) ENGINE=InnoDB;
 
set global innodb_fast_shutdown=0;
--source include/shutdown_mysqld.inc

The above test case should be executed in 10.1. Move the data directory to 10.2 and start the server with default configuration.
Start up will crash with the following stack trace:

#0  0x00007ffff5754428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff575602a in __GI_abort () at abort.c:89
#2  0x0000555557a81335 in ut_dbg_assertion_failed (expr=0x55555870f240 "sym_node->table != NULL", file=0x55555870e9a0 "/home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/pars/pars0pars.cc", line=817) at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/ut/ut0dbg.cc:61
#3  0x00005555577976d5 in pars_retrieve_table_def (sym_node=0x62500001d048) at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/pars/pars0pars.cc:817
#4  0x0000555557799356 in pars_update_statement (node=0x62500001d130, cursor_sym=0x0, search_cond=0x62500001d418) at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/pars/pars0pars.cc:1226
#5  0x0000555557df07be in yyparse () at pars0grm.y:435
#6  0x000055555779ddf6 in pars_sql (info=0x617000008288, str=0x61d000038480 "PROCEDURE DROP_TABLE_PROC () IS\nsys_foreign_id CHAR;\ntable_id CHAR;\nindex_id CHAR;\nforeign_id CHAR;\nspace_id INT;\nfound INT;\nDECLARE CURSOR cur_fk IS\nSELECT ID FROM SYS_FOREIGN\nWHERE FOR_NAME = :table"...) at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/pars/pars0pars.cc:2132
#7  0x00005555577aa769 in que_eval_sql (info=0x617000008288, sql=0x61d000038480 "PROCEDURE DROP_TABLE_PROC () IS\nsys_foreign_id CHAR;\ntable_id CHAR;\nindex_id CHAR;\nforeign_id CHAR;\nspace_id INT;\nfound INT;\nDECLARE CURSOR cur_fk IS\nSELECT ID FROM SYS_FOREIGN\nWHERE FOR_NAME = :table"..., reserve_dict_mutex=0, trx=0x7fffe0f84bf8) at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/que/que0que.cc:1217
#8  0x0000555557880842 in row_drop_table_for_mysql (name=0x617000006d88 "test/#sql-ib-something", trx=0x7fffe0f84bf8, drop_db=false, create_failed=0, nonatomic=true) at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/row/row0mysql.cc:3795
#9  0x000055555787ab5b in row_mysql_drop_garbage_tables () at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/row/row0mysql.cc:2777
#10 0x00005555576f26d3 in recv_recovery_rollback_active () at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/log/log0recv.cc:3490
#11 0x000055555799dfd1 in innobase_start_or_create_for_mysql () at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/srv/srv0start.cc:2554
#12 0x000055555754011a in innobase_init (p=0x615000002df0) at /home/thiru/mariarepo/mdev-12023/mdev-12023/storage/innobase/handler/ha_innodb.cc:4312
#13 0x0000555556be1e13 in ha_initialize_handlerton (plugin=0x621000029f08) at /home/thiru/mariarepo/mdev-12023/mdev-12023/sql/handler.cc:521
#14 0x0000555556544589 in plugin_initialize (tmp_root=0x7fffffffc9d0, plugin=0x621000029f08, argc=0x5555598cc280 <remaining_argc>, argv=0x6160000001b8, options_only=false) at /home/thiru/mariarepo/mdev-12023/mdev-12023/sql/sql_plugin.cc:1419

The problem is that In 10.1, there is no sys_virtual table. In 10.2, "dict_create_or_check_sys_virtual()" is called after "recv_recovery_rollback_active()".
That means, InnoDB tries to drop the table by removing the entry from sys_virtual table before sys_virtual table is created.

The following patch solves the issue:

diff --git a/storage/innobase/row/row0mysql.cc b/storage/innobase/row/row0mysql.cc
index 2249b0c..9161c1d 100644
--- a/storage/innobase/row/row0mysql.cc
+++ b/storage/innobase/row/row0mysql.cc
@@ -3787,8 +3787,10 @@ row_drop_table_for_mysql(
                                "WHERE SPACE = space_id;\n";
                }
 
-               sql +=  "DELETE FROM SYS_VIRTUAL\n"
-                       "WHERE TABLE_ID = table_id;\n";
+               if (dict_sys->sys_virtual != NULL) {
+                       sql +=  "DELETE FROM SYS_VIRTUAL\n"
+                               "WHERE TABLE_ID = table_id;\n";
+               }
 
                sql += "END;\n";

Comment by Marko Mäkelä [ 2018-10-30 ]

The problem is that row_drop_table_for_mysql() assumes that all the InnoDB data dictionary tables exist. On upgrade, that may not be the case.

The table SYS_VIRTUAL was introduced in MariaDB Server 10.2 (and MySQL 5.7),
and the tables SYS_TABLESPACES, SYS_DATAFILES were introduced in 10.0 (5.6).

Also the tables SYS_FOREIGN, SYS_FOREIGN_COLS were not created in the very first published InnoDB version, but only starting with MySQL 3.23.44.

Orphan tables would be dropped before CREATE TABLE IF NOT EXISTS on these dictionary tables is performed. Therefore, when dropping a table, we must allow any of these non-core dictionary tables to be missing.

Comment by Marko Mäkelä [ 2018-10-30 ]

I ported this also to bb-10.2-compatibility.

Generated at Thu Feb 08 07:54:31 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.