[MDEV-32126] Assertion `!writer.checksum_len || writer.remains == 0' fails upon concurrent online ALTER and transactions with failing statements and binary log enabled Created: 2023-09-07  Updated: 2023-11-04  Resolved: 2023-11-04

Status: Closed
Project: MariaDB Server
Component/s: Data Definition - Alter Table
Affects Version/s: 11.2
Fix Version/s: 11.2.2

Type: Bug Priority: Blocker
Reporter: Elena Stepanova Assignee: Nikita Malyavin
Resolution: Fixed Votes: 0
Labels: regression

Issue Links:
Problem/Incident
is caused by MDEV-16329 Engine-independent online ALTER TABLE Closed

 Description   

Set to blocker as it's a bug in a new feature in a soon-to-be GA.

--source include/have_binlog_format_mixed.inc
--source include/have_innodb.inc
--source include/have_debug_sync.inc
 
CREATE TABLE t1 (a INT, b TEXT) ENGINE=InnoDB;
CREATE TABLE t2 (a INT, b INT, c CHAR(8), d TEXT, UNIQUE(a)) ENGINE=InnoDB;
 
--connect (con1,localhost,root,,test)
START TRANSACTION;
CREATE TEMPORARY TABLE tmp (id INT, PRIMARY KEY(id)) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1, REPEAT('x',8000)),(2, REPEAT('x',8000));
UPDATE t2 SET b = NULL ORDER BY b LIMIT 2;
INSERT INTO t1 VALUES (3, REPEAT('x',8000));
DELETE FROM t1;
INSERT INTO t2 VALUES (1,1,'f','e'),(1000,1000,'c','b');
COMMIT;
--send
  SET debug_sync= 'NOW WAIT_FOR go_trx';
 
--connection default
SET debug_sync= 'alter_table_online_progress SIGNAL go_trx WAIT_FOR go_alter';
--send
  ALTER TABLE t2 FORCE, ALGORITHM=COPY, LOCK=NONE;
 
--connection con1
--reap
START TRANSACTION;
DROP TEMPORARY TABLE IF EXISTS tmp;
--error ER_DUP_ENTRY
INSERT INTO t2 VALUES (3,3,'a','x'), (3,3,'a','x');
COMMIT;
set debug_sync= 'NOW SIGNAL go_alter';
 
--connection default
--reap
 
# Cleanup
DROP TABLE t1, t2;

11.2 9ad7c899

mariadbd: /data/src/11.2/sql/log.cc:8052: int Event_log::write_cache(THD*, IO_CACHE*): Assertion `!writer.checksum_len || writer.remains == 0' failed.
230907 20:29:34 [ERROR] mysqld got signal 6 ;
 
#9  0x00007f18bb453df2 in __GI___assert_fail (assertion=0x5583e5c76de0 "!writer.checksum_len || writer.remains == 0", file=0x5583e5c6bf20 "/data/src/11.2/sql/log.cc", line=8052, function=0x5583e5c76b60 "int Event_log::write_cache(THD*, IO_CACHE*)") at ./assert/assert.c:101
#10 0x00005583e425af27 in Event_log::write_cache (this=0x5583e7af1f28 <mysql_bin_log+8>, thd=0x62c000240218, cache=0x61b000040038) at /data/src/11.2/sql/log.cc:8052
#11 0x00005583e4262515 in MYSQL_BIN_LOG::write_transaction_or_stmt (this=0x5583e7af1f20 <mysql_bin_log>, entry=0x7f18a3249830, commit_id=0) at /data/src/11.2/sql/log.cc:9182
#12 0x00005583e425fc6f in MYSQL_BIN_LOG::trx_group_commit_leader (this=0x5583e7af1f20 <mysql_bin_log>, leader=0x7f18a3249830) at /data/src/11.2/sql/log.cc:8902
#13 0x00005583e425e773 in MYSQL_BIN_LOG::write_transaction_to_binlog_events (this=0x5583e7af1f20 <mysql_bin_log>, entry=0x7f18a3249830) at /data/src/11.2/sql/log.cc:8694
#14 0x00005583e425c5d1 in MYSQL_BIN_LOG::write_transaction_to_binlog (this=0x5583e7af1f20 <mysql_bin_log>, thd=0x62c000240218, cache_mngr=0x61b00003fe98, end_ev=0x7f18a3249cc0, all=true, using_stmt_cache=false, using_trx_cache=true, is_ro_1pc=true) at /data/src/11.2/sql/log.cc:8291
#15 0x00005583e4232bf1 in binlog_flush_cache (thd=0x62c000240218, cache_mngr=0x61b00003fe98, end_ev=0x7f18a3249cc0, all=true, using_stmt=false, using_trx=true, is_ro_1pc=true) at /data/src/11.2/sql/log.cc:1907
#16 0x00005583e42339f7 in binlog_commit_flush_trx_cache (thd=0x62c000240218, all=true, cache_mngr=0x61b00003fe98, ro_1pc=true) at /data/src/11.2/sql/log.cc:2006
#17 0x00005583e4236c1c in binlog_commit (thd=0x62c000240218, all=true, ro_1pc=true) at /data/src/11.2/sql/log.cc:2418
#18 0x00005583e3efc181 in commit_one_phase_2 (thd=0x62c000240218, all=true, trans=0x62c000243cc0, is_real_trans=true) at /data/src/11.2/sql/handler.cc:2131
#19 0x00005583e3efbdd7 in ha_commit_one_phase (thd=0x62c000240218, all=true) at /data/src/11.2/sql/handler.cc:2093
#20 0x00005583e3efa048 in ha_commit_trans (thd=0x62c000240218, all=true) at /data/src/11.2/sql/handler.cc:1887
#21 0x00005583e3ac52a3 in trans_commit (thd=0x62c000240218) at /data/src/11.2/sql/transaction.cc:266
#22 0x00005583e35d1925 in mysql_execute_command (thd=0x62c000240218, is_called_from_prepared_stmt=false) at /data/src/11.2/sql/sql_parse.cc:5403
#23 0x00005583e35e10d7 in mysql_parse (thd=0x62c000240218, rawbuf=0x6290002e9238 "COMMIT", length=6, parser_state=0x7f18a324b9f0) at /data/src/11.2/sql/sql_parse.cc:7811
#24 0x00005583e35b9491 in dispatch_command (command=COM_QUERY, thd=0x62c000240218, packet=0x6290002df219 "COMMIT", packet_length=6, blocking=true) at /data/src/11.2/sql/sql_parse.cc:1893
#25 0x00005583e35b61ce in do_command (thd=0x62c000240218, blocking=true) at /data/src/11.2/sql/sql_parse.cc:1406
#26 0x00005583e3a81fac in do_handle_one_connection (connect=0x608000003cb8, put_in_cache=true) at /data/src/11.2/sql/sql_connect.cc:1445
#27 0x00005583e3a8196d in handle_one_connection (arg=0x608000003cb8) at /data/src/11.2/sql/sql_connect.cc:1347
#28 0x00005583e46aa652 in pfs_spawn_thread (arg=0x617000008c98) at /data/src/11.2/storage/perfschema/pfs.cc:2201
#29 0x00007f18bb4a7fd4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#30 0x00007f18bb5285bc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81



 Comments   
Comment by Nikita Malyavin [ 2023-09-19 ]

The smaller test, leading to the same problem, however a different assertion, has been found:

--source include/have_binlog_format_mixed.inc
--source include/have_innodb.inc
--source include/have_debug_sync.inc
 
CREATE TEMPORARY TABLE tmp (id INT, PRIMARY KEY(id)) ENGINE=InnoDB;
CREATE TABLE t1 (a INT, b TEXT) ENGINE=InnoDB;
CREATE TABLE t2 (a INT, b INT, c CHAR(8), d TEXT, UNIQUE(a)) ENGINE=InnoDB;
INSERT INTO t2 VALUES (1,1,'f','e'),(1000,1000,'c','b');
--connection default
SET debug_sync= 'alter_table_online_before_lock SIGNAL go_trx WAIT_FOR go_alter';
send ALTER TABLE t2 FORCE, ALGORITHM=COPY, LOCK=NONE;
 
--connect (con1,localhost,root,,test)
 
SET debug_sync= 'now WAIT_FOR go_trx';
START TRANSACTION;
INSERT INTO t1 VALUES (3,'a');
--error ER_DUP_ENTRY
INSERT INTO t2 VALUES (3,3,'a','x'), (3,3,'a','x');
INSERT INTO t2 VALUES (3,3,'a','x');
COMMIT;
set debug_sync= 'now SIGNAL go_alter';

Comment by Nikita Malyavin [ 2023-09-19 ]

Please review branch bb-11.2-nikita, commit b96b093d

Comment by Nikita Malyavin [ 2023-09-23 ]

There is a lot of interference with binlog_hton enabled by online alter, which I couldn't fight by adhoc state deductions. So I decided to carry online alter out, into a separate handlerton.

This gave me a possibility to store connection-local online alter data as ha_data.
Also I reworked the savepoints handling as handlerton functions.
Please review the following commits:

217a7a7c online alter: rework savepoints
ad6a305d online alter: use thd->ha_data to store cache_list
132c56f1 online alter: extract the source to a separate file
a71b1343 MDEV-32126 Assertion fails upon online ALTER and binary log enabled

Also please consider my 132c56f1 commit among then, which creates online_alter.cc and moves all the producer side code there. It's optional, but really helps reading and editing.

Comment by Sergei Golubchik [ 2023-09-25 ]

The overall problem is a deep interference with the effect of an installed
binlog_hton.

this doesn't really explain anything. Please explain why the assert actually fires.

You can explain here to avoid recommitting everything just to change a comment. Later, when everything is "ok to push" you can fix the comment in commit too together with other changes, if any

Comment by Nikita Malyavin [ 2023-09-25 ]

The high-level reason is that the assumption about that thd->binlog_get_cache_mngr() is, sufficiently, NULL, when we shouldn't run the binlog part of binlog_commit/binlog_rollback, is wrong, i.e. this condition:

  binlog_cache_mngr *const cache_mngr= thd->binlog_get_cache_mngr();
 
  if (!cache_mngr)
  {
...
    DBUG_RETURN(0);
  }

is not enough.

I don't know whether cache_mgr is used for statement logging or not. Here we have a mixed logging, and that is, some operations may have no row-base logging. However, cache_mngr exists, which makes me think, that a statement logging is done differently, and cache_mngr only serves row-based logging, since trx_cache is also empty at the moment of check.

What's definite is that binlog_hton wouldn't have been set for SBR. It gets enabled during binlog_log_row_to_binlog, in THD::binlog_start_trans_and_stmt.

In particular, trx_cache.restore_prev_position() is called during binlog_rollback, when it actually shouldn't have to – bacause it's SBR and binlog_hton is not set:

0x000055f154ef0ac8 in reinit_io_cache (info=0x61b000060638, type=WRITE_CACHE, seek_offset=18446744073709551615, use_async_io=0 '\000', clear_cache=0 '\000') at /home/nik/mariadb/mysys/mf_iocache.c:483
483	    info->pos_in_file=seek_offset;
(gdb) bt
#0  0x000055f154ef0ac8 in reinit_io_cache (info=0x61b000060638, type=WRITE_CACHE, seek_offset=18446744073709551615, use_async_io=0 '\000', clear_cache=0 '\000') at /home/nik/mariadb/mysys/mf_iocache.c:483
#1  0x000055f152b2b072 in binlog_cache_data::truncate (this=0x61b000060638, pos=18446744073709551615, reset_cache=false) at /home/nik/mariadb/sql/log.cc:492
#2  0x000055f152b2341a in binlog_cache_data::restore_prev_position (this=0x61b000060638) at /home/nik/mariadb/sql/log.cc:364
#3  0x000055f152b1c2d3 in binlog_truncate_trx_cache (thd=0x62b0001c0218, cache_mngr=0x61b000060498, all=false) at /home/nik/mariadb/sql/log.cc:2109
#4  0x000055f152ac7c80 in binlog_rollback (hton=0x615000002118, thd=0x62b0001c0218, all=false) at /home/nik/mariadb/sql/log.cc:2535
#5  0x000055f152617f6e in ha_rollback_trans (thd=0x62b0001c0218, all=false) at /home/nik/mariadb/sql/handler.cc:2264
#6  0x000055f1537c4d75 in trans_rollback_stmt (thd=0x62b0001c0218) at /home/nik/mariadb/sql/transaction.cc:535
#7  0x000055f15313e5ab in mysql_execute_command (thd=0x62b0001c0218, is_called_from_prepared_stmt=false) at /home/nik/mariadb/sql/sql_parse.cc:5854
#8  0x000055f15311ccfb in mysql_parse (thd=0x62b0001c0218, rawbuf=0x629000316238 "INSERT INTO t2 VALUES (3,3,'a','x'), (3,3,'a','x')", length=50, parser_state=0x7f2f140497f0) at /home/nik/mariadb/sql/sql_parse.cc:7811
#9  0x000055f15311674e in dispatch_command (command=COM_QUERY, thd=0x62b0001c0218, packet=0x62900030c219 "INSERT INTO t2 VALUES (3,3,'a','x'), (3,3,'a','x')", packet_length=50, blocking=true) at /home/nik/mariadb/sql/sql_parse.cc:1893
#10 0x000055f15311eb2a in do_command (thd=0x62b0001c0218, blocking=true) at /home/nik/mariadb/sql/sql_parse.cc:1406

I tried a few approaches to fix that problem. I tried to only exit when trx_cache->before_stmt_pos is MY_OFF_T_UNDEF, but then DBUG_ASSERT(WSREP(thd) || rollback_online) breaks, meaning, that sometimes it's fine to have undefined trx_cache. But in that case, binlog_commit/binlog_rollback is supposed to work somehow – since the handlerton was installed for some reason.

I tried to go deeper and take MY_OFF_T_UNDEF more carefully into account – actually, I tried to modify binlog_cache_data::empty() and write a more careful exit condition. That failed tests in random places, like it had an assertion failure in MYSQL_BIN_LOG::trx_group_commit_leader.

So overall this demonstrates that binlog state consistency is quite fragile, and I don't want to play around with it. I had a success in extracting everything in a separate hton with no much price, but some benefits, like online_alter_cache_list is moved out of thd, and I call it debloating (yes, first I bloated it ). And what's more important is more independency from binlog, which will definitely come quite handy.

Comment by Nikita Malyavin [ 2023-10-11 ]

Minor issues are resolved, head is now 8d21bf9f5. Waiting for the final verdict.

Comment by Sergei Golubchik [ 2023-10-13 ]

8d21bf9f5 2e38c6cb8d8c (new commit with GPL headers) is ok to push, thanks!

Generated at Thu Feb 08 10:29:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.