[MDEV-19449] 1030: Got error 168 "Unknown (generic) error from engine" for valid TRUNCATE (temporary) TABLE Created: 2019-05-13  Updated: 2019-05-14  Resolved: 2019-05-14

Status: Closed
Project: MariaDB Server
Component/s: Data Definition - Temporary, Storage Engine - InnoDB
Affects Version/s: 10.2.24
Fix Version/s: 10.2.25

Type: Bug Priority: Major
Reporter: Matthias Leich Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: affects-tests-10.2

Attachments: File ml_19.test    
Issue Links:
Problem/Incident
is caused by MDEV-13564 TRUNCATE TABLE and undo tablespace tr... Closed
Relates
relates to MDEV-17167 InnoDB: Failing assertion: table->get... Closed

 Description   

...
CREATE TEMPORARY TABLE t1 (col1 BIGINT NULL) ENGINE = InnoDB;
INSERT INTO t1 ( k ) SELECT id FROM t1 ;
ERROR 42S22: Unknown column 'id' in 'field list'
TRUNCATE TABLE t1 ;
ALTER TABLE t1 CHANGE COLUMN id col2 BIGINT NULL;
ERROR 42S22: Unknown column 'id' in 't1'
TRUNCATE TABLE t1 ;
main.ml_19 'innodb'                      [ fail ]
        Test ended at 2019-05-13 13:56:45
 
CURRENT_TEST: main.ml_19
mysqltest: At line 10: query 'TRUNCATE TABLE t1 ' failed: 1030: Got error 168 "Unknown (generic) error from engine" from storage engine InnoDB
 
10.2.25 commit 8ce702aa90c174566f4ac950e85cc7570bf9b647
10.2.24 commit b30bbb7d9a1c3855c901b74735660e49ab64a2a8 2019-03-25
 
No issue in
10.3.15 commit 0c188d5efc452b576c46270e65b9db4746ee9dfe (pull 2019-05-13)



 Comments   
Comment by Anel Husakovic [ 2019-05-13 ]

It seems like table is "locked".
Here is bt from gdb:

2019-05-13  6:16:24 140736932443904 [ERROR] InnoDB: Cannot rename table 'tmp/#sql-ib30' to 'tmp/#sql-ib30' since the dictionary cache already contains 'tmp/#sql-ib30'.
 
Thread 32 "mysqld" hit Breakpoint 1, my_message_sql (error=1030, str=0x7fffdedd4330 "Got error 168 \"Unknown (generic) error from engine\" from storage engine InnoDB", MyFlags=0) at sql/mysqld.cc:3612
(gdb) bt
#0  my_message_sql (error=1030, str=0x7fffdedd4330 "Got error 168 \"Unknown (generic) error from engine\" from storage engine InnoDB", MyFlags=0) at sql/mysqld.cc:3612
#1  0x00005555564d563d in my_error (nr=1030, MyFlags=0) at mysys/my_error.c:125
#2  0x0000555555d9937d in handler::print_error (this=0x7fff6801e880, error=168, errflag=0) at sql/handler.cc:3735
#3  0x0000555555f5e17e in Sql_cmd_truncate_table::handler_truncate (this=0x7fff68010d98, thd=0x7fff68000cf8, table_ref=0x7fff68010780, is_tmp_table=true) at sql/sql_truncate.cc:244
#4  0x0000555555f5e637 in Sql_cmd_truncate_table::truncate_table (this=0x7fff68010d98, thd=0x7fff68000cf8, table_ref=0x7fff68010780) at sql/sql_truncate.cc:399
#5  0x0000555555f5e912 in Sql_cmd_truncate_table::execute (this=0x7fff68010d98, thd=0x7fff68000cf8) at sql/sql_truncate.cc:495
#6  0x0000555555b1d6a3 in mysql_execute_command (thd=0x7fff68000cf8) at sql/sql_parse.cc:6233
#7  0x0000555555b227de in mysql_parse (thd=0x7fff68000cf8, rawbuf=0x7fff680106b0 "TRUNCATE TABLE t1", length=17, parser_state=0x7fffdedd6230, is_com_multi=false, is_next_command=false) at sql/sql_parse.cc:8020
#8  0x0000555555b1000e in dispatch_command (command=COM_QUERY, thd=0x7fff68000cf8, packet=0x7fff680083f9 "TRUNCATE TABLE t1", packet_length=17, is_com_multi=false, is_next_command=false) at sql/sql_parse.cc:1832
#9  0x0000555555b0e964 in do_command (thd=0x7fff68000cf8) at sql/sql_parse.cc:1386
#10 0x0000555555c616c9 in do_handle_one_connection (connect=0x5555579aea58) at sql/sql_connect.cc:1335
#11 0x0000555555c61449 in handle_one_connection (arg=0x5555579aea58) at sql/sql_connect.cc:1241
#12 0x00007ffff72f26db in start_thread (arg=0x7fffdedd7700) at pthread_create.c:463
#13 0x00007ffff66dc88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment by Matthias Leich [ 2019-05-13 ]

Hi Anel,
yes, some superfluous lock would quite good fit to my observations during simplification of the test.
I tried to omit the INSERT and the ALTER (both fail anyway) but than the bad effect simply disappeared.

Comment by Marko Mäkelä [ 2019-05-13 ]

Note: TEMPORARY TABLE never uses native ALTER TABLE, always ALGORITHM=COPY.

I suspect that something in ALTER TABLE is corrupting the handler::table_share so that TRUNCATE in InnoDB fails (and preserves the old contents of the table).

Comment by Marko Mäkelä [ 2019-05-13 ]

I missed the following message:

2019-05-13  6:16:24 140736932443904 [ERROR] InnoDB: Cannot rename table 'tmp/#sql-ib30' to 'tmp/#sql-ib30' since the dictionary cache already contains 'tmp/#sql-ib30'.

The ml_19.test essentially performs two TRUNCATE for a temporary table. For some reason, the simplified variant will not fail:

CREATE TEMPORARY TABLE t1 (col1 BIGINT NULL) ENGINE = InnoDB;
TRUNCATE TABLE t1;
TRUNCATE TABLE t1;

Also, the test failed to reproduce the problem with 10.3 for me.

In any case, the TRUNCATE of TEMPORARY TABLE should avoid renaming the table. It should suffice to simply drop and create the table.

Comment by Marko Mäkelä [ 2019-05-14 ]

I did some more debugging. I am afraid that the problem could be unique to the 10.2 series.

If I remember correctly, MariaDB 10.3 allows multiple concurrent handles to temporary tables, but 10.2 is not supposed to allow that. The problem appears to be that at the time the TRUNCATE executes, there will be multiple open table handles to the temporary table. MDEV-13564 expects there to be only the current table handle. You can set a breakpoint and see the large reference counts:

break dict_table_rename_in_cache
run
display table->n_ref_count
display table->name
continue

The problem occurs when a table is renamed from the same name (#sql-ib19 in my case) to the same name. That should not be possible, but apparently an extraneous table handle for the temporary table is still referring to the temporary table before the first TRUNCATE TABLE operation. Because of the background drop queue, that table was not dropped yet, and also AddressSanitizer does not notice any foul play.

Here is a slightly edited and commented version of ml_19.test that reproduces the problem:

--source include/have_innodb.inc
CREATE TEMPORARY TABLE t1 (col1 BIGINT NULL) ENGINE = InnoDB;
--error ER_BAD_FIELD_ERROR
INSERT INTO t1 (no_such_col) SELECT * FROM t1;
--echo # the bogus column name can be before or after SELECT, or both
TRUNCATE TABLE t1 ;
--error ER_BAD_FIELD_ERROR
ALTER TABLE t1 CHANGE no_such_col1 col1 BIGINT NULL;
--echo # MODIFY no_such_col BIGINT NULL will not cause the failure
TRUNCATE TABLE t1 ;
--echo # wrongly tried to re-truncate the old table from before the first TRUNCATE

Comment by Marko Mäkelä [ 2019-05-14 ]

Backporting (cherry-picking) the MDEV-17167 fix to 10.2 fixes this test case.

Comment by Marko Mäkelä [ 2019-05-14 ]

I tried the test case on 10.3 with the MDEV-17167 fix reverted, and it reproduces the MDEV-17167 assertion failure:

10.3 5c93509aad6aa952c48f078557c86322dfc35692 after git revert bad2f1569da57c4a81cc84ec2f4a79924df9c8d6

InnoDB: Failing assertion: table->get_ref_count() == 0
#7  0x00005564064216c5 in dict_table_remove_from_cache_low (table=0x7f0698006be8, lru_evict=0) at /mariadb/10.3/storage/innobase/dict/dict0dict.cc:1978
#8  0x0000556406421e61 in dict_table_remove_from_cache (table=0x7f0698006be8) at /mariadb/10.3/storage/innobase/dict/dict0dict.cc:2061
#9  0x00005564062ab5b0 in row_drop_table_for_mysql (name=0x7f07086f1440 "mysqld.1/#sql-ib20", trx=0x7f07091f21c8, sqlcom=SQLCOM_TRUNCATE, create_failed=false, nonatomic=true) at /mariadb/10.3/storage/innobase/row/row0mysql.cc:3367
#10 0x0000556406171461 in ha_innobase::delete_table (this=0x7f06980a76d0, name=0x7f06980a8d38 "mysqld.1/#sql-ib20", sqlcom=SQLCOM_TRUNCATE) at /mariadb/10.3/storage/innobase/handler/ha_innodb.cc:12921
#11 0x000055640615d26b in ha_innobase::truncate (this=0x7f06980a76d0) at /mariadb/10.3/storage/innobase/handler/ha_innodb.cc:13342

Generated at Thu Feb 08 08:51:47 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.