[MDEV-29761]  Bulk insert fails to rollback during insert..select Created: 2022-10-11  Updated: 2022-11-21  Resolved: 2022-10-25

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.7, 10.8, 10.9, 10.10, 10.11
Fix Version/s: 10.7.7, 10.8.6, 10.9.4, 10.10.2, 10.11.1

Type: Bug Priority: Critical
Reporter: Roel Van de Paar Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 0
Labels: corruption, regression

Issue Links:
Problem/Incident
causes MDEV-30047 Memory leak on rollback of bulk insert Closed
Relates
relates to MDEV-29801 Inconsistent ER_TOO_BIG_ROWSIZE Open
relates to MDEV-24621 In bulk insert, pre-sort and build in... Closed
relates to MDEV-26453 Assertion `0' failed in row_upd_sec_i... Closed

Description

This looks very similar to the previously fixed MDEV-26453, and also to MDEV-28190, yet this one affects 10.7+ only. There is also a possible connection with MDEV-27744.

SET unique_checks=0,foreign_key_checks=0;
CREATE TABLE t1 (c INT) ENGINE=InnoDB;
ALTER TABLE t1 ADD CONSTRAINT cst1 UNIQUE INDEX (c);
INSERT t1 SELECT 1 FROM seq_1_to_15;  # 15 Rows affected
SELECT * FROM t1;  # 0 Rows
DELETE FROM t1;

Leads to:

10.11.0 6ebdd3013a18b01dbecec76b870810329eb76586 (Debug)

mysqld: /test/10.11_dbg/storage/innobase/row/row0upd.cc:1969: dberr_t row_upd_sec_index_entry(upd_node_t*, que_thr_t*): Assertion `0' failed.

10.11.0 6ebdd3013a18b01dbecec76b870810329eb76586 (Debug)

Core was generated by `/test/MD190922-mariadb-10.11.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[Current thread is 1 (Thread 0x152434cd0700 (LWP 875027))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000015245b45a859 in __GI_abort () at abort.c:79
#2  0x000015245b45a729 in __assert_fail_base (fmt=0x15245b5f0588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5605426b33c6 "0", file=0x56054275e7b8 "/test/10.11_dbg/storage/innobase/row/row0upd.cc", line=1969, function=<optimized out>) at assert.c:92
#3  0x000015245b46bfd6 in __GI___assert_fail (assertion=assertion@entry=0x5605426b33c6 "0", file=file@entry=0x56054275e7b8 "/test/10.11_dbg/storage/innobase/row/row0upd.cc", line=line@entry=1969, function=function@entry=0x56054275f798 "dberr_t row_upd_sec_index_entry(upd_node_t*, que_thr_t*)") at assert.c:101
#4  0x00005605420e9203 in row_upd_sec_index_entry (node=node@entry=0x1523c0035fd0, thr=thr@entry=0x1523c0036298) at /test/10.11_dbg/storage/innobase/row/row0upd.cc:1969
#5  0x00005605420ec55a in row_upd_sec_step (thr=0x1523c0036298, node=0x1523c0035fd0) at /test/10.11_dbg/storage/innobase/row/row0upd.cc:2094
#6  row_upd (thr=0x1523c0036298, node=0x1523c0035fd0) at /test/10.11_dbg/storage/innobase/row/row0upd.cc:2818
#7  row_upd_step (thr=thr@entry=0x1523c0036298) at /test/10.11_dbg/storage/innobase/row/row0upd.cc:2933
#8  0x0000560542093dfe in row_update_for_mysql (prebuilt=0x1523c00355e0) at /test/10.11_dbg/storage/innobase/row/row0mysql.cc:1686
#9  0x0000560541f0ed1a in ha_innobase::delete_row (this=0x1523c0034da0, record=0x1523c00330a8 "\375\001") at /test/10.11_dbg/storage/innobase/handler/ha_innodb.cc:8706
#10 0x0000560541c09b19 in handler::ha_delete_row (this=0x1523c0034da0, buf=0x1523c00330a8 "\375\001") at /test/10.11_dbg/sql/handler.cc:7715
#11 0x00005605418b97f5 in TABLE::delete_row (this=0x1523c001f218) at /test/10.11_dbg/sql/sql_delete.cc:281
#12 0x00005605418b7c02 in mysql_delete (thd=thd@entry=0x1523c0000d48, table_list=0x1523c00132c0, conds=<optimized out>, order_list=order_list@entry=0x1523c0005a68, limit=18446744073709551615, options=<optimized out>, result=<optimized out>) at /test/10.11_dbg/sql/sql_delete.cc:842
#13 0x00005605419145e0 in mysql_execute_command (thd=thd@entry=0x1523c0000d48, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /test/10.11_dbg/sql/sql_limit.h:85
#14 0x000056054190003c in mysql_parse (thd=thd@entry=0x1523c0000d48, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x152434ccf330) at /test/10.11_dbg/sql/sql_parse.cc:8037
#15 0x000056054190d66d in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x1523c0000d48, packet=packet@entry=0x1523c000aed9 "DELETE FROM t1", packet_length=packet_length@entry=14, blocking=blocking@entry=true) at /test/10.11_dbg/sql/sql_class.h:1345
#16 0x000056054190fd97 in do_command (thd=0x1523c0000d48, blocking=blocking@entry=true) at /test/10.11_dbg/sql/sql_parse.cc:1407
#17 0x0000560541a73fb9 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x5605444a0f78, put_in_cache=put_in_cache@entry=true) at /test/10.11_dbg/sql/sql_connect.cc:1416
#18 0x0000560541a744c3 in handle_one_connection (arg=0x5605444a0f78) at /test/10.11_dbg/sql/sql_connect.cc:1318
#19 0x000015245b96b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#20 0x000015245b557133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The crash is confirmed present in:
MariaDB: 10.7.6 (dbg), 10.8.5 (dbg), 10.9.3 (dbg), 10.10.2 (dbg), 10.11.0 (dbg)

The error is confirmed present in:
MariaDB: 10.7.6 (dbg), 10.7.6 (opt), 10.8.5 (dbg), 10.8.5 (opt), 10.9.3 (dbg), 10.9.3 (opt), 10.10.2 (dbg), 10.10.2 (opt), 10.11.0 (dbg), 10.11.0 (opt)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.3.37 (dbg), 10.3.37 (opt), 10.4.27 (dbg), 10.4.27 (opt), 10.5.18 (dbg), 10.5.18 (opt), 10.6.10 (dbg), 10.6.10 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.51 (dbg), 5.6.51 (opt), 5.7.38 (dbg), 5.7.38 (opt), 8.0.29 (dbg), 8.0.29 (opt)



Comments
Comment by Marko Mäkelä [ 2022-10-11 ]

Based on the test case, it seems more likely that this was caused by MDEV-24621 or subsequent changes in that area. It is notable that the fix of MDEV-29570 had not been merged to 10.11 as of 6ebdd3013a18b01dbecec76b870810329eb76586.

Could this simply be a duplicate of MDEV-29570?

Comment by Roel Van de Paar [ 2022-10-11 ]

It is not immediately obvious how to bisect to the commit that introduced this. For example, even a build from 2022-05-27 (10.10, debug, b3df1ec97aacc27678c44eefe56ea8680456d608) crashes with the same stack, but that could have been the previous bug as well.
Now checking whether it is fixed by the recent commit, as suggested by marko (thanks!).

Comment by Marko Mäkelä [ 2022-10-11 ]

I can repeat this with the latest 10.7. Here is a simplified test case:

--source include/have_innodb.inc
CREATE TEMPORARY TABLE t(c INT);
INSERT INTO t VALUES(1),(1);
CREATE TABLE t1 (c INT UNIQUE) ENGINE=InnoDB;
SET unique_checks=0,foreign_key_checks=0;
INSERT INTO t1 SELECT 1 FROM t;
CHECK TABLE t1;
SELECT * FROM t1;
DROP TABLE t1;

10.7 b6ebadaa66ee68b1880c0e10669543d1ba058c18

Table	Op	Msg_type	Msg_text
test.t1	check	Warning	InnoDB: Index 'c' contains 0 entries, should be 2.

Comment by Roel Van de Paar [ 2022-10-14 ]

Thank you marko. Confirmed that 10.7 at b6ebadaa66ee68b1880c0e10669543d1ba058c18 reproduces the crash with the original test case.
With the updated test case, executed at the CLI, I get:

10.7.7 b6ebadaa66ee68b1880c0e10669543d1ba058c18 (Debug)

10.7.7-dbg>CHECK TABLE t1;
+---------+-------+----------+----------------------------------------------------+
| Table   | Op    | Msg_type | Msg_text                                           |
+---------+-------+----------+----------------------------------------------------+
| test.t1 | check | Warning  | InnoDB: Index 'c' contains 0 entries, should be 2. |
| test.t1 | check | error    | Corrupt                                            |
+---------+-------+----------+----------------------------------------------------+
2 rows in set (0.001 sec)
 
10.7.7-dbg>SELECT * FROM t1;
ERROR 1712 (HY000): Index t1 is corrupted

And:

10.7.7 b6ebadaa66ee68b1880c0e10669543d1ba058c18 (Debug)

2022-10-14 16:04:12 4 [ERROR] InnoDB: Flagged corruption of `c` in table `test`.`t1` in CHECK TABLE; Wrong count
2022-10-14 16:04:12 4 [ERROR] Got error 180 when reading table './test/t1'

$ ./bin/perror 180
MariaDB error code 180: Index corrupted

Comment by Thirunarayanan Balathandayuthapani [ 2022-10-17 ]

Patch is in bb-10.7-MDEV-29761

Comment by Marko Mäkelä [ 2022-10-18 ]

This looks mostly correct to me. Should trx_t::bulk_insert_apply() attempt to process all tables and return a combined error? Is there a multi-table scenario where the suggested patch would misbehave because it is only rolling back changes to one table, instead of rolling back to the oldest problematic change?

I would suggest a different name for the added member function:

  /** @return the first undo record that modified the table */
  undo_no_t get_first() const
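
To make the multi-table question above concrete, here is a minimal sketch of the "roll back to the oldest problematic change" idea. It is not the InnoDB code and not the patch in bb-10.7-MDEV-29761: the `table_mod` class, the `bulk_failed` flag, the `mod_tables` map and `oldest_failed_change()` are all hypothetical names, and `undo_no_t` is replaced by a plain integer alias. Only the `get_first()` accessor is taken from the suggestion above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for InnoDB's undo_no_t (not the real definition).
using undo_no_t = std::uint64_t;

// Hypothetical per-table bookkeeping kept by one transaction.
class table_mod {
  undo_no_t first_;           // first undo record that modified the table
  bool bulk_failed_ = false;  // assumed flag: applying the bulk buffer failed
public:
  explicit table_mod(undo_no_t first) : first_(first) {}
  /** @return the first undo record that modified the table */
  undo_no_t get_first() const { return first_; }
  void set_bulk_failed() { bulk_failed_ = true; }
  bool bulk_failed() const { return bulk_failed_; }
};

// Hypothetical map from table name to its bookkeeping entry.
using mod_tables = std::map<std::string, table_mod>;

// Visit every table rather than stopping at the first failure, and
// return the smallest get_first() among the tables whose bulk insert
// failed. An empty result means that nothing needs to be rolled back.
std::optional<undo_no_t> oldest_failed_change(const mod_tables& tables) {
  std::optional<undo_no_t> oldest;
  for (const auto& entry : tables) {
    const table_mod& mod = entry.second;
    if (!mod.bulk_failed()) continue;
    if (!oldest || mod.get_first() < *oldest) oldest = mod.get_first();
  }
  return oldest;
}

// Usage: t1 and t3 fail; the rollback target is the older of the two.
inline void demo() {
  mod_tables tables;
  tables.emplace("t1", table_mod(7));
  tables.emplace("t2", table_mod(3));
  tables.emplace("t3", table_mod(5));
  assert(!oldest_failed_change(tables));
  tables.at("t1").set_bulk_failed();
  tables.at("t3").set_bulk_failed();
  assert(oldest_failed_change(tables) == std::optional<undo_no_t>(5));
}
```

In this sketch, rolling back only to the first undo record of t1 (7) would leave the failed changes to t3 (starting at 5) in place, which is the kind of misbehaviour the question is asking about.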

If this also fixes MDEV-29801, please add that test case to the regression suite as well.
