|
Is the issue perhaps simple to understand or deduce? For example; server shutdown -> InnoDB attempted further work -> where such work was unexpected and unhandled by binlog code, or similar?
/*
|
There should be no pending XIDs at shutdown, and only one entry (for
|
the active binlog file) in the list.
|
*/
|
DBUG_ASSERT(b->xid_count == 0);
|
|
|
The assert may relate to MDEV-24302, MDEV-24526.
Roel: It'd be helpful to learn about the load specifics. Could you please provide that somehow?
|
|
Here is what I know so far;
1) The issue can be reproduced using a single thread to the server
2) None of the error logs I checked (24) have a crashing Query noted for this bug
3) The issue is hard to reproduce
4) The issue also exists on 10.5.6 build 24/9/20
5) We have an `rr` trace which reproduces the issue. We also have multiple core dumps. Both can be analyzed further
I can help you or one of your team members to get setup with access to rr trace (like gdb) and core dump.
Please let me know if you have any other questions.
|
|
I have a very rough testcase with only 21 lines. I am confident that this testcases could reproduce the said issue at one point in time, against 10.5.6, build 24/9/20, even though I cannot get it to reproduce even a single time now, even against the same build. The testcase may have many bits with are unnecessary/superfluous. For sure, not all of the mysqld options are required; likely less then 2-5 are required, but it not clear which ones. Note that we have a SHUTDOWN just before the creation of an InnoDB table, which indeed seems to correspond much with what we know of this bug so far. Feel free to play around with the testcase. Please ignore ";#...." messages after queries; whilst they are relevant as to the original/first occurrence of the bug, the reduction may have rendered them incorrect and current output can look different due to a different buildup sequence. The only reason I left them is that I did not want to make any changes to a known-to-work (at least at one point) testcase. 100k-500k repeats, replay via pquery or other C-connector based clients may help. Single SQL thread is sufficient. The reduction was done using the pquery client. Also interesting is the binlog_size change, though it is not established that this is necessary for this particular bug. Sequential replay should be sufficient, though random order replay may be required. Hopefully this will shed some light. Elkin fyi.
# mysqld options used: --log-bin --sql_mode=ONLY_FULL_GROUP_BY --net_read_timeout=30 --net_write_timeout=60 --loose-table_lock_wait_timeout=50 --wait_timeout=28800 --lock-wait-timeout=86400 --innodb-lock-wait-timeout=50 --log_output=FILE --log_bin_trust_function_creators=1 --loose-max-statement-time=30 --loose-debug_assert_on_not_freed_memory=0 --innodb-buffer-pool-size=256M --innodb_use_native_aio=0 --max_allowed_packet=33554432 --maximum-bulk_insert_buffer_size=1M --maximum-join_buffer_size=1M --maximum-max_heap_table_size=1M --maximum-max_join_size=1M --maximum-myisam_max_sort_file_size=1M --maximum-myisam_mmap_size=1M --maximum-myisam_sort_buffer_size=1M --maximum-optimizer_trace_max_mem_size=1M --maximum-preload_buffer_size=1M --maximum-query_alloc_block_size=1M --maximum-query_prealloc_size=1M --maximum-range_alloc_block_size=1M --maximum-read_buffer_size=1M --maximum-read_rnd_buffer_size=1M --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --innodb_file_per_table=1 --innodb_flush_method=O_DIRECT --innodb_lock_schedule_algorithm=fcfs --innodb_stats_persistent=off --loose-idle_write_transaction_timeout=0 --loose-idle_transaction_timeout=0 --loose-idle_readonly_transaction_timeout=0 --connect_timeout=60 --interactive_timeout=28800
|
DROP DATABASE test;
|
CREATE DATABASE test;
|
USE test;
|
CREATE TABLE t (a CHAR(65));#ERROR: 1100 - Table 't' was not locked with LOCK TABLES
|
create table t1 (c1 char(10), c2 char(10)) engine=RocksDB;#ERROR: 1100 - Table 't1' was not locked with LOCK TABLES
|
CREATE TEMPORARY TABLE tmp_myisam_394 (a VARCHAR(256)) ENGINE=MyISAM;#NOERROR
|
create table t2 (c1 int);#ERROR: 1100 - Table 't2' was not locked with LOCK TABLES
|
insert into t2 values (12772+0.33);#ERROR: 1100 - Table 't2' was not locked with LOCK TABLES
|
create temporary table t1 (a int key) engine=MEMORY;#NOERROR
|
insert into t2 values (37591);#ERROR: 1100 - Table 't2' was not locked with LOCK TABLES
|
drop database if exists db_drop3;#ERROR: 1192 - Can't execute the given command because you have active locked tables or an active transaction
|
DROP PROCEDURE IF EXISTS p1;#ERROR: 1192 - Can't execute the given command because you have active locked tables or an active transaction
|
insert into t2 values (3871);#ERROR: 1100 - Table 't2' was not locked with LOCK TABLES
|
DROP TABLES t1, t2;#ERROR: 1051 - Unknown table 'test.t2'
|
INSERT INTO t1 VALUES(-8388608,0),(0,0),(8388607,16777215),('-8388608','0'),('8388607','16777215'),(-8388608.0,0.0),(8388607.0,16777215.0);#NOERROR
|
SET @@global.max_binlog_size = 4096;#NOERROR
|
insert into t1 values (6719,'6719');#ERROR: 1136 - Column count doesn't match value count at row 1
|
CREATE TEMPORARY TABLE t1 (c1 INT, INDEX(c1)) ENGINE=MRG_RocksDB UNION=(t1,t2);#ERROR: 1050 - Tabela 't1' ve\0107 postoji
|
shutdown;#NOERROR
|
CREATE TABLE t1 (id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY) ENGINE=innodb;#ERROR: 2006 - MySQL server has gone away
|
A cleaned up version for readability, but this testcase was never proven to produce the bug even once, whereas the one above was;
# mysqld options required for replay: --log-bin --sql_mode=ONLY_FULL_GROUP_BY --net_read_timeout=30 --net_write_timeout=60 --loose-table_lock_wait_timeout=50 --wait_timeout=28800 --lock-wait-timeout=86400 --innodb-lock-wait-timeout=50 --log_output=FILE --log_bin_trust_function_creators=1 --loose-max-statement-time=30 --loose-debug_assert_on_not_freed_memory=0 --innodb-buffer-pool-size=256M --innodb_use_native_aio=0 --max_allowed_packet=33554432 --maximum-bulk_insert_buffer_size=1M --maximum-join_buffer_size=1M --maximum-max_heap_table_size=1M --maximum-max_join_size=1M --maximum-myisam_max_sort_file_size=1M --maximum-myisam_mmap_size=1M --maximum-myisam_sort_buffer_size=1M --maximum-optimizer_trace_max_mem_size=1M --maximum-preload_buffer_size=1M --maximum-query_alloc_block_size=1M --maximum-query_prealloc_size=1M --maximum-range_alloc_block_size=1M --maximum-read_buffer_size=1M --maximum-read_rnd_buffer_size=1M --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --innodb_file_per_table=1 --innodb_flush_method=O_DIRECT --innodb_lock_schedule_algorithm=fcfs --innodb_stats_persistent=off --loose-idle_write_transaction_timeout=0 --loose-idle_transaction_timeout=0 --loose-idle_readonly_transaction_timeout=0 --connect_timeout=60 --interactive_timeout=28800
|
DROP DATABASE test;
|
CREATE DATABASE test;
|
USE test;
|
CREATE TABLE t (a CHAR(65));
|
CREATE TABLE t1 (c1 CHAR(10), c2 CHAR(10)) ENGINE=RocksDB;
|
CREATE TEMPORARY TABLE tmp_myisam_394 (a VARCHAR(256)) ENGINE=MyISAM;
|
CREATE TABLE t2 (c1 INT);
|
INSERT INTO t2 VALUES (12772+0.33);
|
CREATE TEMPORARY TABLE t1 (a INT KEY) ENGINE=MEMORY;
|
INSERT INTO t2 VALUES (37591);
|
DROP DATABASE IF EXISTS db_drop3;
|
DROP PROCEDURE IF EXISTS p1;
|
INSERT INTO t2 VALUES (3871);
|
DROP TABLES t1, t2;
|
INSERT INTO t1 VALUES (-8388608,0), (0,0), (8388607,16777215), ('-8388608','0'), ('8388607','16777215'), (-8388608.0,0.0), (8388607.0,16777215.0);
|
SET GLOBAL max_binlog_size = 4096;
|
INSERT INTO t1 VALUES (6719,'6719');
|
CREATE TEMPORARY TABLE t1 (c1 INT, INDEX (c1)) ENGINE=mrg_rocksdb UNION= (t1,t2);
|
SHUTDOWN;
|
CREATE TABLE t1 (id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY) ENGINE=InnoDB;
|
|
|
Another testcase which I was able to reduce to 28 lines. Non-cleaned up version which was sure to produce the issue at least once:
# mysqld options required for replay: --max_allowed_packet=33554432 --maximum-bulk_insert_buffer_size=1M --maximum-join_buffer_size=1M --maximum-max_heap_table_size=1M --maximum-max_join_size=1M --maximum-myisam_max_sort_file_size=1M --maximum-myisam_mmap_size=1M --maximum-myisam_sort_buffer_size=1M --maximum-optimizer_trace_max_mem_size=1M --maximum-preload_buffer_size=1M --maximum-query_alloc_block_size=1M --maximum-query_prealloc_size=1M --maximum-range_alloc_block_size=1M --maximum-read_buffer_size=1M --maximum-read_rnd_buffer_size=1M --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --sql_mode=ONLY_FULL_GROUP_BY --innodb_file_per_table=1 --innodb_flush_method=O_DIRECT --innodb_stats_persistent=off --loose-idle_write_transaction_timeout=0 --loose-idle_transaction_timeout=0 --loose-idle_readonly_transaction_timeout=0 --connect_timeout=60 --interactive_timeout=28800 --slave_net_timeout=60 --net_read_timeout=30 --net_write_timeout=60 --loose-table_lock_wait_timeout=50 --wait_timeout=28800 --lock-wait-timeout=86400 --innodb-lock-wait-timeout=50 --log_output=FILE --log-bin --log_bin_trust_function_creators=1 --loose-max-statement-time=30 --loose-debug_assert_on_not_freed_memory=0 --innodb-buffer-pool-size=300M
|
CREATE DATABASE transforms;
|
DROP DATABASE test;
|
CREATE DATABASE test;
|
USE test;
|
SET @@session.tx_isolation= 'READ-COMMITTED';#ERROR: 1231 - Variable 'tx_isolation' can't be set to the value of 'READ-COMMITTED'
|
CREATE TABLE t1 (id int PRIMARY KEY, b char(3), INDEX(b));#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
CREATE TABLE ti (a TINYINT UNSIGNED, b BIGINT UNSIGNED, c BINARY(6) NOT NULL, d VARBINARY(99) NOT NULL, e VARBINARY(54) NOT NULL, f VARBINARY(77) NOT NULL, g LONGBLOB NOT NULL, h BLOB NOT NULL, id BIGINT NOT NULL, KEY(b), KEY(e), PRIMARY KEY(id)) ENGINE=RocksDB;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
CREATE TEMPORARY TABLE t1 (a INT, b INT);#ERROR: 1050 - Tabela 't1' već postoji
|
set global max_binlog_size=100;#ERROR: 1231 - Variable 'max_binlog_size' can't be set to the value of '100'
|
INSERT INTO ti VALUES (161,11484719563904639465,'9n9dSoKcYLJzJPeyPEmmRiqNSrYlV0EjP','t4u2XPQ4KW6BJLaMayzsGESjAy8vrEU4Ja','wtQbaDgPzbqJivpDBQLjBxDXptqExVW8rMS5r2TNrqgVILShXgY1pT837cJt','SyN2O98mFGxkxvvF8MT8LzsPG8V5GmGIdD1qIYsIZKHi039gpejoWdoZ6Wk1JvBOwrwNxwER1Um3GhiJKgMEoEN6cyhuvD79wEgp8SBuG','d','cXE',6);#ERROR: 1264 - Out of range value for column 'b' at row 1
|
CREATE TABLE test_repair_table ( val integer not null ) ENGINE = CSV;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
RENAME TABLE t1 TO tt1tt1;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
INSERT INTO ti VALUES (569784834,17686,'lT79eTvuYN0CBnamoxazOjEtuEZ8R2p','eHbtzcUVbWl7NUfruAg40Vi2WxKQ4mfhjYMoY2uLy8QuXDgNeNppunOBQX00EE7BecTDJklwodwWp1tc8TvLHhdyyP9wmwOFGlZzgyp2wS6uBjEiW0NrBo01tUO91EL1v0xwvGOBtXzx4Ed2nBnoSZplnHsa8Ikfm2Kohuv62iX8c0wJSr3qeSEKD9jvHH8QW5RDIytCRnd','EZ3EXKSQ48vGowMUpQSVzz723HG','ZLuDwRZb7v17gTSa3yvEVJiWYPwmOI8ZM4Selb7f6zagtMyc5N8NQ4N489uqRyHCz76dOeva00z7WTCkDvcPPmQBtB9ibZu4GJRdcQKvtOe59tLTOTIBav8clCtzLvKGbqPkJl847Aj3UpbguYOwjO3uS7gz7Tm8UtsP3Gr5pnOu7TdvwGY7lmDrRVZM0amM45FsCQe5PRN','c','S',4);#ERROR: 1264 - Out of range value for column 'a' at row 1
|
CREATE DATABASE RocksDB_test;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
CREATE FUNCTION bug48766 () RETURNS ENUM('а','б','в','г') CHARACTER SET ucs2 RETURN 0;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
DROP PROCEDURE IF EXISTS spexecute11;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
CREATE USER test_2@localhost;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
INSERT INTO ti VALUES (1894749719158306246,-2519559641696011780,'Ab75','hxHCQ8ESQs1Y9pOxc1DRm6bIeyR5scaCBSoIP0PzTtr7v6NFVvhdO9MbVUyT9l7gSdf8maFEhwF8YBD5PQ1TMn4PLxucQ4l7pPJiFgV7ByQT1k3ICNU36sBCqzsMu','6K98fqdjH2HNGWSaFGugTiFunW0txxTHoFRbDEIkXy3','17G3zliWOzQXA4JXM','0','B',3);#ERROR: 1264 - Out of range value for column 'a' at row 1
|
CREATE TABLE TokuDB (a INT) ENGINE = TokuDB;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
DROP TABLE IF EXISTS `t2`;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
CREATE TABLE t10(c1 INT, c2 char(20)) ENGINE = InnoDB;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
CREATE TABLE IF NOT EXISTS `龗龗龗`(`丄丄丄` char(1)) DEFAULT CHARSET = utf8 engine=RocksDB;#ERROR: 1300 - Invalid big5 character string: '\xE9\xBE\x97\xE9\xBE\x97\xE9\xBE\x97'
|
grant select on RocksDBtest.* to RocksDBtest_1@localhost;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
drop table if exists `T9a`;#ERROR: 1300 - Invalid big5 character string: '\xEF\xBC\xB4\xEF\xBC\x99a'
|
replace INTO t1 values ('cc', '-cc');#ERROR: 1136 - Broj kolona ne odgovara broju vrednosti u slogu 1
|
DROP PROCEDURE IF EXISTS p1;#ERROR: 1399 - XAER_RMFAIL: The command cannot be executed when global transaction is in the ACTIVE state
|
SELECT SLEEP(3);
|
A cleaned up version for readability, but this testcase was never proven to produce the bug even once, whereas the one above was;
# mysqld options required for replay: --max_allowed_packet=33554432 --maximum-bulk_insert_buffer_size=1M --maximum-join_buffer_size=1M --maximum-max_heap_table_size=1M --maximum-max_join_size=1M --maximum-myisam_max_sort_file_size=1M --maximum-myisam_mmap_size=1M --maximum-myisam_sort_buffer_size=1M --maximum-optimizer_trace_max_mem_size=1M --maximum-preload_buffer_size=1M --maximum-query_alloc_block_size=1M --maximum-query_prealloc_size=1M --maximum-range_alloc_block_size=1M --maximum-read_buffer_size=1M --maximum-read_rnd_buffer_size=1M --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --sql_mode=ONLY_FULL_GROUP_BY --innodb_file_per_table=1 --innodb_flush_method=O_DIRECT --innodb_stats_persistent=off --loose-idle_write_transaction_timeout=0 --loose-idle_transaction_timeout=0 --loose-idle_readonly_transaction_timeout=0 --connect_timeout=60 --interactive_timeout=28800 --slave_net_timeout=60 --net_read_timeout=30 --net_write_timeout=60 --loose-table_lock_wait_timeout=50 --wait_timeout=28800 --lock-wait-timeout=86400 --innodb-lock-wait-timeout=50 --log_output=FILE --log-bin --log_bin_trust_function_creators=1 --loose-max-statement-time=30 --loose-debug_assert_on_not_freed_memory=0 --innodb-buffer-pool-size=300M
|
CREATE DATABASE transforms;
|
DROP DATABASE test;
|
CREATE DATABASE test;
|
USE test;
|
SET SESSION tx_isolation= 'READ-COMMITTED';
|
CREATE TABLE t1 (id INT PRIMARY KEY, b CHAR(3), INDEX (b));
|
CREATE TABLE ti (a TINYINT UNSIGNED, b BIGINT UNSIGNED, c BINARY (6) NOT NULL, d VARBINARY (99) NOT NULL, e VARBINARY (54) NOT NULL, f VARBINARY (77) NOT NULL, g LONGBLOB NOT NULL, h BLOB NOT NULL, id BIGINT NOT NULL, KEY(b), KEY(e), PRIMARY KEY(id)) ENGINE=RocksDB;
|
CREATE TEMPORARY TABLE t1 (a INT, b INT);
|
SET GLOBAL max_binlog_size=100;
|
INSERT INTO ti VALUES(161,11484719563904639465,'9n9dSoKcYLJzJPeyPEmmRiqNSrYlV0EjP','t4u2XPQ4KW6BJLaMayzsGESjAy8vrEU4Ja','wtQbaDgPzbqJivpDBQLjBxDXptqExVW8rMS5r2TNrqgVILShXgY1pT837cJt','SyN2O98mFGxkxvvF8MT8LzsPG8V5GmGIdD1qIYsIZKHi039gpejoWdoZ6Wk1JvBOwrwNxwER1Um3GhiJKgMEoEN6cyhuvD79wEgp8SBuG','d','cXE',6);
|
CREATE TABLE test_repair_table (val INT NOT NULL) ENGINE=CSV;
|
RENAME TABLE t1 TO tt1tt1;
|
INSERT INTO ti VALUES(569784834,17686,'lT79eTvuYN0CBnamoXAzOjEtuEZ8R2p','eHbtzcUVbWl7NUfruAg40Vi2WxKQ4mfhjYMoY2uLy8QuXDgNeNppunOBQX00EE7BecTDJklwodwWp1tc8TvLHhdyyP9wmwOFGlZzgyp2wS6uBjEiW0NrBo01tUO91EL1v0xwvGOBtXzx4Ed2nBnoSZplnHsa8Ikfm2Kohuv62iX8c0wJSr3qeSEKD9jvHH8QW5RDIytCRnd','EZ3EXKSQ48vGowMUpQSVzz723HG','ZLuDwRZb7v17gTSa3yvEVJiWYPwmOI8ZM4Selb7f6zagtMyc5N8NQ4N489uqRyHCz76dOeva00z7WTCkDvcPPmQBtB9ibZu4GJRdcQKvtOe59tLTOTIBav8clCtzLvKGbqPkJl847Aj3UpbguYOwjO3uS7gz7Tm8UtsP3Gr5pnOu7TdvwGY7lmDrRVZM0amM45FsCQe5PRN','c','S',4);
|
CREATE DATABASE rocksdb_test;
|
CREATE FUNCTION bug48766() RETURNS ENUM ('а','б','в','г') CHARACTER SET ucs2 RETURN 0;
|
DROP PROCEDURE IF EXISTS spexecute11;
|
CREATE USER test_2@localhost;
|
INSERT INTO ti VALUES(1894749719158306246,-2519559641696011780,'Ab75','hxHCQ8ESQs1Y9pOxc1DRm6bIeyR5scaCBSoIP0PzTtr7v6NFVvhdO9MbVUyT9l7gSdf8maFEhwF8YBD5PQ1TMn4PLxucQ4l7pPJiFgV7ByQT1k3ICNU36sBCqzsMu','6K98fqdjH2HNGWSaFGugTiFunW0txxTHoFRbDEIkXy3','17G3zliWOzQXA4JXM','0','B',3);
|
CREATE TABLE TokuDB (a INT) ENGINE=TokuDB;
|
DROP TABLE IF EXISTS t2;
|
CREATE TABLE t10 (c1 INT, c2 CHAR(20)) ENGINE=InnoDB;
|
CREATE TABLE IF NOT EXISTS 龗龗龗 (丄丄丄 CHAR(1)) DEFAULT CHARSET=utf8 ENGINE=RocksDB;
|
GRANT SELECT ON RocksDBtest.* TO rocksdbtest_1@localhost;
|
DROP TABLE IF EXISTS T9a;
|
REPLACE INTO t1 VALUES('cc', '-cc');
|
DROP PROCEDURE IF EXISTS p1;
|
SELECT SLEEP (3);
|
|
|
We now have 8 rr traces + cores for this bug.
|
|
Roel, howdy. I am refreshing my understanding of the case and apparently I need to explore the rr traces. Before to jump to that, I could not understand refinements of how-to-reproduce that you left in the last comments.
The description shows a stack with MYSQL_BIN_LOG::cleanup which is caused by mysql> shutdown;. I don't see the latter in the above https://jira.mariadb.org/browse/MDEV-24660?focusedCommentId=179811&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-179811 .
Could you please explain?
Cheers, Andrei.
|
|
Roel,
First, still to the 2nd of https://jira.mariadb.org/browse/MDEV-24660?focusedCommentId=179811&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-179811 paste,
it just can not cause the stack of the description, which runs:
#4 0x0000562770aedc77 in MYSQL_BIN_LOG::cleanup (this=0x562771cd43c0 <mysql_bin_log>)
|
|
at /test/10.6_dbg/sql/log.cc:3449
|
|
#5 0x00005627705a2428 in clean_up (print_message=print_message@entry=true) at /test/10.6_dbg/sql/mysqld.cc:1962
|
|
So I feel it must've meant something else than the description stack happened to you..
Secondly, I have not gotten yet the rr server access. I remember Marko e-mailed that, I should find.
But before I do, I'd suggest we wait for MDEV-24302 fixes be tested to try again with MDEV-24302 fixes pushed to 10.5 to reproduce the description issue.
There's a good chance that issue was related.
Could you please arrange that for one last time?
|
|
OK, after a lot of work on the testcase of another occurrence I have finally got more to report on this one.
Firstly, the only option which seems to be required is --log-bin.
Secondly, this testcase can reproduce the issue using reducer as a base run tool + pquery (in other words; a C/C++ based client connection seems to be required and potentially this may not replay from the CLI):
# mysqld options required for replay: --log-bin
|
CREATE DATABASE transforms;
|
DROP DATABASE test;
|
CREATE DATABASE test;
|
USE test;
|
SET sql_mode='';
|
CREATE TEMPORARY TABLE `���`(`���` char(1)) DEFAULT CHARSET=sjis engine=InnoDB;#NOERROR
|
delete from mysql.user where user='user1' or user='user2';#NOERROR
|
CREATE TABLE t1(c1 char(1)) DEFAULT CHARSET=ujis ENGINE=InnoDB;#ERROR: 1050 - Table 't1' already exists
|
INSERT INTO t1 VALUES(17509);#NOERROR
|
DELETE FROM t1;#NOERROR
|
create table t2 (a int auto_increment,b int,PRIMARY KEY (a)) ENGINE=InnoDB;#NOERROR
|
INSERT INTO t1 VALUES (-123.456e0);#NOERROR
|
LOAD DATA INFILE '../../tmp/proc.txt' INTO TABLE InnoDB.proc;#ERROR: 1100 - Table''was not locked with LOCK TABLES
|
CREATE TABLE `アアア`(`カカカ` char(1)) DEFAULT CHARSET=utf8 engine=InnoDB;#ERROR: 1100 - Table 'アアア' was not locked with LOCK TABLES
|
INSERT INTO t1 VALUES(0xADBF);#ERROR: 1792 - Cannot execute statement in a READ ONLY transaction
|
CREATE TABLE ti (a SMALLINT UNSIGNED,b BIGINT UNSIGNED ,c CHAR(32) ,d VARCHAR(31) ,e VARCHAR(31),f VARBINARY(58),g TINYBLOB,h BLOB ,id BIGINT ,KEY(b),KEY(e),PRIMARY KEY(id)) ENGINE=InnoDB;#ERROR: 1792 - Cannot execute statement in a READ ONLY transaction
|
insert into t1 values (-1),(-2),(-3);#ERROR: 1792 - Cannot execute statement in a READ ONLY transaction
|
INSERT INTO ti VALUES (4023595403684610118,7836,'7swVX','xWYjZYSkX0P','JJgE','so18BIYlJMt8r05JqWP3Q7e5i8xMnZyp','1','7',11);#ERROR: 1792 - Cannot execute statement in a READ ONLY transaction
|
SET @@global.max_binlog_size=4095;#NOERROR
|
shutdown;#NOERROR
|
The testcase is highly sporadic and requires continuous startup/shutdowns of mysqld.
Thirdly, any attempts at removing the # comments (which are no longer correct; they are from the original run) have failed thus far. As such, there is a suspicion that "Exact length" is required, and I note the max_binlog_size=4095, present in other ways in the other testcases also. This may be an 'off by one' issue or similar.
Given that there are rr traces available (and additional ones can be produced at will), there would seem to be sufficient information for this bug to be analyzed further by Elkin.
|
|
Also uploaded that last testcase as MDEV-24660.sql to ensure the special chars are captured correctly.
|
|
Also seen in 10.10, regularly.
mysqld: /test/10.10_dbg/sql/log.cc:3557: void MYSQL_BIN_LOG::cleanup(): Assertion `b->xid_count == 0' failed.
|
Core was generated by `/test/MD310522-mariadb-10.10.0-linux-x86_64-dbg/bin/mysqld --no-defaults --max_'.
|
Program terminated with signal SIGABRT, Aborted.
|
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
|
[Current thread is 1 (Thread 0x150584635800 (LWP 4172928))]
|
(gdb) bt
|
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
|
#1 0x000015058480e859 in __GI_abort () at abort.c:79
|
#2 0x000015058480e729 in __assert_fail_base (fmt=0x1505849a4588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5585ea58250c "b->xid_count == 0", file=0x5585ea581fe4 "/test/10.10_dbg/sql/log.cc", line=3557, function=<optimized out>) at assert.c:92
|
#3 0x000015058481ffd6 in __GI___assert_fail (assertion=assertion@entry=0x5585ea58250c "b->xid_count == 0", file=file@entry=0x5585ea581fe4 "/test/10.10_dbg/sql/log.cc", line=line@entry=3557, function=function@entry=0x5585ea58251e "void MYSQL_BIN_LOG::cleanup()") at assert.c:101
|
#4 0x00005585e9c2955e in MYSQL_BIN_LOG::cleanup (this=0x5585eaceb660 <mysql_bin_log>) at /test/10.10_dbg/sql/log.cc:3557
|
#5 0x00005585e96ab259 in clean_up (print_message=print_message@entry=true) at /test/10.10_dbg/sql/mysqld.cc:1968
|
#6 0x00005585e96b766c in mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/10.10_dbg/sql/mysqld.cc:5937
|
#7 0x00005585e96aab66 in main (argc=<optimized out>, argv=<optimized out>) at /test/10.10_dbg/sql/main.cc:34
|
|
|
The issue is always seen at shudown only.
|
10.10.0 081a284712bb661349e2e3802077b12211cede3e (Debug)
|
2022-06-12 18:46:57 0 [Note] /test/MD310522-mariadb-10.10.0-linux-x86_64-dbg/bin/mysqld (initiated by: root[root] @ localhost []): Normal shutdown
|
2022-06-12 18:46:57 0 [Note] Event Scheduler: Killing the scheduler thread, thread id 23
|
2022-06-12 18:46:57 0 [Note] Event Scheduler: Waiting for the scheduler thread to reply
|
2022-06-12 18:46:57 0 [Note] Event Scheduler: Stopped
|
2022-06-12 18:46:57 18 [Note] Stopping ack receiver thread
|
2022-06-12 18:46:57 0 [Note] InnoDB: FTS optimize thread exiting.
|
mysqld: /test/10.10_dbg/sql/log.cc:3557: void MYSQL_BIN_LOG::cleanup(): Assertion `b->xid_count == 0' failed.
|
|
|
New RR traces of the shutdown hang issues (using RESET MASTER) produced on latest 10.6.9 source and handed to Elkin.
With many thanks to @marko for his ever-helpful assistance with rr and related matters.
|
|
Roel, as a summary of my analysis of rr traces available, the /data/934191/1/rr/mysqld-0, 9a08fcbf60567 trace deals with a stale version. We need the latest 10.5 to be checked as it contains bb-10.5-MDEV-24302 two commits. It's also fine if it's easier to work on 10.10 that your latest comments refer to.
The 2nd /data/320105/1/rr/mysqld-0 trace does not represent a hanging server. The final state prior to the server kill does not show any RESET-MASTER nor SHUTDOWN.
|
|
Elkin Great, thank you for the analysis. Ack on 1st trace, and for 2nd I am surprised as msqladmin shutdown was definitely hanging for 90 seconds before SIGKILL was sent to mysqladmin shutdown (whilst mysqld kept running) and then SIGABRT was sent to mysqld which finalized/rounded up the rr trace. It may be that the server was in some other state than RESET MASTER or SHUTDOWN, however it would seem that something was definitely hanging. I will next generate additional traces also attempting 10.10 as discussed. I may also see if somehow I can reproduce the assert abort rather than the RESET MASTER hang.
|
|
Status update: reproduced the assert b->xid_count == 0 under rr in 10.10, handed to Andrei, who found the cause, and is creating a patch.
|
|
The patch created by Elkin resolves both the b->xid_count == 0 assert, as well as the RESET MASTER hang on shutdown.
|
|
Salve, Brandon.
Could you please check the commit at
5bf4dee3694..bca750f81fb HEAD -> bb-10.5-andrei
Cheers,
Andrei
|
|
Looks good to me! Thanks, Elkin!
|
|
For merging to 10.6 notice bb-10.6-andrei-MDEV-24660.
|