[MDEV-32249] strings/ctype-ucs2.c:2336: my_vsnprintf_utf32: Assertion `(n % 4) == 0' failed in my_vsnprintf_utf32 on INSERT Created: 2023-09-26  Updated: 2023-10-11  Resolved: 2023-10-11

Status: Closed
Project: MariaDB Server
Component/s: Character Sets, Replication
Affects Version/s: 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10, 10.11, 11.0, 11.1, 11.2, 11.3, 11.3.0
Fix Version/s: 10.8.8, 10.4.32, 10.5.23, 10.6.16, 10.9.8, 10.10.7, 10.11.6, 11.0.4, 11.1.3, 11.2.2, 11.3.1

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Alexander Barkov
Resolution: Fixed Votes: 0
Labels: None


 Description   

This seems to be a recent regression:
The bug does not reproduce on 10.6 of 11 Sep at 961b96a5e0dd40512b8fff77dcec273187ccc9fd but does reproduce at 10.6 of 26 Sep at d13a57ae8181f2a8fbee86838d5476740e050d50
When the following SQL is executed in a standard master/slave setup:

CREATE TABLE t (c VARCHAR(1) CHARACTER SET utf32 COLLATE utf32_turkish_ci) ENGINE=MyISAM;
CALL sys.statement_performance_analyzer (1,1,1);
DROP TABLE t;
CREATE TABLE t (c INT) ENGINE=MyISAM;
SET sql_log_bin=1;
INSERT INTO t VALUES (1);

The slave (debug build) will assert:

10.6.16 d13a57ae8181f2a8fbee86838d5476740e050d50 (Debug)

2023-09-26 15:36:59 5 [Note] Slave I/O thread: Start asynchronous replication to master 'repl_user@127.0.0.1:12208' in log '' at position 4
2023-09-26 15:36:59 5 [Note] Slave I/O thread: connected to master 'repl_user@127.0.0.1:12208',replication starts at GTID position ''
2023-09-26 15:36:59 6 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 4, relay log './relaylog.000001' position: 4; GTID position ''
mariadbd: /test/10.6_dbg/strings/ctype-ucs2.c:2336: my_vsnprintf_utf32: Assertion `(n % 4) == 0' failed.

10.6.16 d13a57ae8181f2a8fbee86838d5476740e050d50 (Debug)

Core was generated by `/test/BASE_MD260923-mariadb-10.6.16-linux-x86_64-dbg/bin/mariadbd --no-defaults'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=22541758277184)
    at ./nptl/pthread_kill.c:44
[Current thread is 1 (Thread 0x1480697ea640 (LWP 1701910))]
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=22541758277184) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=22541758277184) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=22541758277184, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x000014808f042476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x000014808f0287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x000014808f02871b in __assert_fail_base (fmt=0x14808f1dd150 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x563fb08bd6aa "(n % 4) == 0", file=0x563fb08bd5e0 "/test/10.6_dbg/strings/ctype-ucs2.c", line=2336, function=<optimized out>) at ./assert/assert.c:92
#6  0x000014808f039e96 in __GI___assert_fail (assertion=assertion@entry=0x563fb08bd6aa "(n % 4) == 0", file=file@entry=0x563fb08bd5e0 "/test/10.6_dbg/strings/ctype-ucs2.c", line=line@entry=2336, function=function@entry=0x563fb08bded0 <__PRETTY_FUNCTION__.40> "my_vsnprintf_utf32") at ./assert/assert.c:101
#7  0x0000563fb02834f0 in my_vsnprintf_utf32 (ap=0x1480697e9098, fmt=0x563fb0571df0 "varchar(%u octets) character set %s", n=<optimized out>, dst=<optimized out>) at /test/10.6_dbg/strings/ctype-ucs2.c:2336
#8  my_snprintf_utf32 (cs=cs@entry=0x563fb0ce4c60 <my_charset_utf32_turkish_uca_ci>, to=<optimized out>, n=<optimized out>, fmt=fmt@entry=0x563fb0571df0 "varchar(%u octets) character set %s") at /test/10.6_dbg/strings/ctype-ucs2.c:2428
#9  0x0000563fafa9dd32 in Field_varstring::sql_rpl_type (this=0x147f7c01f8c8, res=0x1480697e9530) at /test/10.6_dbg/sql/sql_string.h:359
#10 0x0000563faf94cc28 in table_def::compatible_with (this=this@entry=0x147f7c021be8, thd=0x147f7c000f98, rgi=rgi@entry=0x14803006c470, table=0x147f7c023cb8, conv_table_var=conv_table_var@entry=0x1480697e98d0) at /test/10.6_dbg/sql/rpl_utility_server.cc:999
#11 0x0000563fafc36685 in Rows_log_event::do_apply_event (this=0x14803006f998, rgi=0x14803006c470) at /test/10.6_dbg/sql/log_event_server.cc:5602
#12 0x0000563faf7443f4 in Log_event::apply_event (rgi=0x14803006c470, this=0x14803006f998) at /test/10.6_dbg/sql/log_event.h:1499
#13 apply_event_and_update_pos_apply (ev=ev@entry=0x14803006f998, thd=thd@entry=0x147f7c000f98, rgi=rgi@entry=0x14803006c470, reason=reason@entry=0) at /test/10.6_dbg/sql/slave.cc:3894
#14 0x0000563faf74e4df in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x14803006f998, thd=thd@entry=0x147f7c000f98, rgi=rgi@entry=0x14803006c470) at /test/10.6_dbg/sql/slave.cc:4090
#15 0x0000563faf9e3db0 in rpt_handle_event (qev=qev@entry=0x14803006fb18, rpt=rpt@entry=0x148030026d78) at /test/10.6_dbg/sql/rpl_parallel.cc:64
#16 0x0000563faf9e889a in handle_rpl_parallel_thread (arg=arg@entry=0x148030026d78) at /test/10.6_dbg/sql/rpl_parallel.cc:1512
#17 0x0000563fafdafc3e in pfs_spawn_thread (arg=0x148030038768) at /test/10.6_dbg/storage/perfschema/pfs.cc:2201
#18 0x000014808f094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#19 0x000014808f126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Attempts to replicate the bug on a standalone (non-replication) server have failed thus far.



 Comments   
Comment by Alexander Barkov [ 2023-09-27 ]

This code is wrong:

void Field_varstring::sql_rpl_type(String *res) const
{
  CHARSET_INFO *cs=charset();

We are printing the data type here, to send it to a log. So the character set of the Field itself does not matter at all, it should not be used.

From a glance, it should be changed to:

void Field_varstring::sql_rpl_type(String *res) const
{
  CHARSET_INFO *cs= &my_charset_latin1;

The same problem is in Field_string::sql_rpl_type().

It seems this code hasn't been changed recently.

It's interesting which change caused the regression.

Also, a reproducible MTR test is needed.

Comment by Roel Van de Paar [ 2023-09-27 ]

MTR Testcase

--source include/have_binlog_format_row.inc
--source include/master-slave.inc
--connection master
CREATE TABLE t (c VARCHAR(1) CHARACTER SET utf32 COLLATE utf32_turkish_ci) ENGINE=MyISAM;
--error ER_SIGNAL_EXCEPTION
CALL sys.statement_performance_analyzer (1,1,1);
DROP TABLE t;
CREATE TABLE t (c INT) ENGINE=MyISAM;
SET sql_log_bin=1;
INSERT INTO t VALUES (1);
DROP TABLE t;  # Cleanup
--source include/rpl_end.inc

Comment by Roel Van de Paar [ 2023-09-27 ]

10.6 Optimzed MTR (does not crash like debug builds): slave SQL thread failure:

10.6.16 d13a57ae8181f2a8fbee86838d5476740e050d50 (Optimized)

worker[01] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
include/master-slave.inc
[connection master]
connection master;
CREATE TABLE t (c VARCHAR(1) CHARACTER SET utf32 COLLATE utf32_turkish_ci) ENGINE=MyISAM;
CALL sys.statement_performance_analyzer (1,1,1);
summary
Disabled 1 thread
ERROR 45000: The snapshot action requires in_table to be NULL, NOW() or specify an existing table. The table `sys`.`1` does not exist.
DROP TABLE t;
CREATE TABLE t (c INT) ENGINE=MyISAM;
SET sql_log_bin=1;
INSERT INTO t VALUES (1);
DROP TABLE t;
include/rpl_end.inc
main.test 'row'                          [ fail ]
        Test ended at 2023-09-27 14:58:50
 
CURRENT_TEST: main.test
analyze: sync_with_master
mysqltest: In included file "./include/rpl_sync.inc": 
included from ./include/rpl_end.inc at line 80:
included from /test/BASE_MD260923-mariadb-10.6.16-linux-x86_64-opt/mysql-test/main/test.test at line 14:
At line 157: sync_slave_with_master failed: 'select master_pos_wait('master-bin.000001', 937, 300, '')' returned NULL indicating slave SQL thread failure
 
 
 == /test/BASE_MD260923-mariadb-10.6.16-linux-x86_64-opt/mysql-test/var/tmp/analyze-sync_with_master-mysqld.1.err ==
 
############################## default ##############################
 
**** SHOW WARNINGS on default ****
SHOW WARNINGS;
Level	Code	Message
 
**** SELECT replication-related variables on default ****
SELECT NOW(), @@SERVER_ID;
NOW()	@@SERVER_ID
2023-09-27 14:58:49	1
 
**** SHOW SLAVE STATUS on default ****
SHOW SLAVE STATUS;
 
**** SHOW MASTER STATUS on default ****
SHOW MASTER STATUS;
File	master-bin.000001
Position	937
Binlog_Do_DB	
Binlog_Ignore_DB	
 
**** SHOW SLAVE HOSTS on default ****
SHOW SLAVE HOSTS;
Server_id	2
Host	127.0.0.1
Port	16001
Master_id	1
 
**** SHOW PROCESSLIST on default ****
SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info	Progress
7	root	localhost:17356	NULL	Binlog Dump	0	Master has sent all binlog to slave; waiting for more updates	NULL	0.000
10	root	localhost	NULL	Query	0	starting	SHOW PROCESSLIST	0.000
 
**** SHOW BINARY LOGS on default ****
SHOW BINARY LOGS;
Log_name	File_size
master-bin.000001	937
 
**** SHOW BINLOG EVENTS on default ****
binlog_name = 'master-bin.000001'
SHOW BINLOG EVENTS IN 'master-bin.000001';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
master-bin.000001	4	Format_desc	1	256	Server ver: 10.6.16-MariaDB-log, Binlog ver: 4
master-bin.000001	256	Gtid_list	1	285	[]
master-bin.000001	285	Binlog_checkpoint	1	329	master-bin.000001
master-bin.000001	329	Gtid	1	371	GTID 0-1-1
master-bin.000001	371	Query	1	535	use `test`; CREATE TABLE t (c VARCHAR(1) CHARACTER SET utf32 COLLATE utf32_turkish_ci) ENGINE=MyISAM
master-bin.000001	535	Gtid	1	577	BEGIN GTID 0-1-2
master-bin.000001	577	Annotate_rows	1	624	INSERT INTO t VALUES (1)
master-bin.000001	624	Table_map	1	668	table_id: 38 (test.t)
master-bin.000001	668	Write_rows_v1	1	706	table_id: 38 flags: STMT_END_F
master-bin.000001	706	Query	1	779	COMMIT
master-bin.000001	779	Gtid	1	821	GTID 0-1-3
master-bin.000001	821	Query	1	937	use `test`; DROP TABLE `t` /* generated by server */
 
**** SHOW RELAYLOG EVENTS on default ****
relaylog_name = 'No such row'
SHOW RELAYLOG EVENTS IN 'No such row';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
connection default;
 
 == /test/BASE_MD260923-mariadb-10.6.16-linux-x86_64-opt/mysql-test/var/tmp/analyze-sync_with_master-mysqld.2.err ==
 
############################## default ##############################
 
**** SHOW WARNINGS on default ****
SHOW WARNINGS;
Level	Code	Message
 
**** SELECT replication-related variables on default ****
SELECT NOW(), @@SERVER_ID;
NOW()	@@SERVER_ID
2023-09-27 14:58:49	2
 
**** SHOW SLAVE STATUS on default ****
SHOW SLAVE STATUS;
Slave_IO_State	Waiting for master to send event
Master_Host	127.0.0.1
Master_User	root
Master_Port	16000
Connect_Retry	1
Master_Log_File	master-bin.000001
Read_Master_Log_Pos	937
Relay_Log_File	slave-relay-bin.000002
Relay_Log_Pos	835
Relay_Master_Log_File	master-bin.000001
Slave_IO_Running	Yes
Slave_SQL_Running	No
Replicate_Do_DB	
Replicate_Ignore_DB	
Replicate_Do_Table	
Replicate_Ignore_Table	
Replicate_Wild_Do_Table	
Replicate_Wild_Ignore_Table	
Last_Errno	1677
Last_Error	Column 0 of table 'test.t' cannot be converted from type 'int' to type ''
Skip_Counter	0
Exec_Master_Log_Pos	535
Relay_Log_Space	1546
Until_Condition	None
Until_Log_File	
Until_Log_Pos	0
Master_SSL_Allowed	No
Master_SSL_CA_File	
Master_SSL_CA_Path	
Master_SSL_Cert	
Master_SSL_Cipher	
Master_SSL_Key	
Seconds_Behind_Master	NULL
Master_SSL_Verify_Server_Cert	No
Last_IO_Errno	0
Last_IO_Error	
Last_SQL_Errno	1677
Last_SQL_Error	Column 0 of table 'test.t' cannot be converted from type 'int' to type ''
Replicate_Ignore_Server_Ids	
Master_Server_Id	1
Master_SSL_Crl	
Master_SSL_Crlpath	
Using_Gtid	No
Gtid_IO_Pos	
Replicate_Do_Domain_Ids	
Replicate_Ignore_Domain_Ids	
Parallel_Mode	optimistic
SQL_Delay	0
SQL_Remaining_Delay	NULL
Slave_SQL_Running_State	
Slave_DDL_Groups	1
Slave_Non_Transactional_Groups	1
Slave_Transactional_Groups	0
 
**** SHOW MASTER STATUS on default ****
SHOW MASTER STATUS;
File	slave-bin.000001
Position	534
Binlog_Do_DB	
Binlog_Ignore_DB	
 
**** SHOW SLAVE HOSTS on default ****
SHOW SLAVE HOSTS;
 
**** SHOW PROCESSLIST on default ****
SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info	Progress
6	system user		NULL	Slave_IO	0	Waiting for master to send event	NULL	0.000
10	root	localhost	NULL	Query	0	starting	SHOW PROCESSLIST	0.000
 
**** SHOW BINARY LOGS on default ****
SHOW BINARY LOGS;
Log_name	File_size
slave-bin.000001	534
 
**** SHOW BINLOG EVENTS on default ****
binlog_name = 'slave-bin.000001'
SHOW BINLOG EVENTS IN 'slave-bin.000001';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
slave-bin.000001	4	Format_desc	2	256	Server ver: 10.6.16-MariaDB-log, Binlog ver: 4
slave-bin.000001	256	Gtid_list	2	285	[]
slave-bin.000001	285	Binlog_checkpoint	2	328	slave-bin.000001
slave-bin.000001	328	Gtid	1	370	GTID 0-1-1
slave-bin.000001	370	Query	1	534	use `test`; CREATE TABLE t (c VARCHAR(1) CHARACTER SET utf32 COLLATE utf32_turkish_ci) ENGINE=MyISAM
 
**** SHOW RELAYLOG EVENTS on default ****
relaylog_name = 'slave-relay-bin.000002'
SHOW RELAYLOG EVENTS IN 'slave-relay-bin.000002';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
slave-relay-bin.000002	4	Format_desc	2	256	Server ver: 10.6.16-MariaDB-log, Binlog ver: 4
slave-relay-bin.000002	256	Rotate	1	0	master-bin.000001;pos=4
slave-relay-bin.000002	304	Format_desc	1	256	Server ver: 10.6.16-MariaDB-log, Binlog ver: 4
slave-relay-bin.000002	556	Gtid_list	1	285	[]
slave-relay-bin.000002	585	Binlog_checkpoint	1	329	master-bin.000001
slave-relay-bin.000002	629	Gtid	1	371	GTID 0-1-1
slave-relay-bin.000002	671	Query	1	535	use `test`; CREATE TABLE t (c VARCHAR(1) CHARACTER SET utf32 COLLATE utf32_turkish_ci) ENGINE=MyISAM
slave-relay-bin.000002	835	Gtid	1	577	BEGIN GTID 0-1-2
slave-relay-bin.000002	877	Annotate_rows	1	624	INSERT INTO t VALUES (1)
slave-relay-bin.000002	924	Table_map	1	668	table_id: 38 (test.t)
slave-relay-bin.000002	968	Write_rows_v1	1	706	table_id: 38 flags: STMT_END_F
slave-relay-bin.000002	1006	Query	1	779	COMMIT
slave-relay-bin.000002	1079	Gtid	1	821	GTID 0-1-3
slave-relay-bin.000002	1121	Query	1	937	use `test`; DROP TABLE `t` /* generated by server */
connection default;
 
 
 
 - saving '/test/BASE_MD260923-mariadb-10.6.16-linux-x86_64-opt/mysql-test/var/log/main.test-row/' to '/test/BASE_MD260923-mariadb-10.6.16-linux-x86_64-opt/mysql-test/var/log/main.test-row/'
--------------------------------------------------------------------------

Comment by Roel Van de Paar [ 2023-09-27 ]

Findings this far:

  • git-bisect failed for this issue
  • All versions as of 11 Sep 23 do not recognize the sys.statement_performance_analyzer call:

    10.4.32 7d7ea79916a588364f781ba756d98d1a9c358339 (Debug)

    mysqltest: At line 6: query 'CALL sys.statement_performance_analyzer (1,1,1)' failed with wrong errno 1305: 'PROCEDURE sys.statement_performance_analyzer does not exist', instead of 1644...
    

  • However, sys.statement_performance_analyzer seems to have been around for some time, based on https://mariadb.com/kb/en/statement_performance_analyzer/ (created 1 year and 2 months ago)
  • git diff 961b96a5e0dd40512b8fff77dcec273187ccc9fd..d13a57ae8181f2a8fbee86838d5476740e050d50 does not show any relevant changes (as checked by @bar) and does not contain any reference to statement_performance_analyzer:

    10.6$ git diff 961b96a5e0dd40512b8fff77dcec273187ccc9fd..d13a57ae8181f2a8fbee86838d5476740e050d50 | grep -i 'statement.performance.analyzer'
    10.6$ git diff 961b96a5e0dd40512b8fff77dcec273187ccc9fd..d13a57ae8181f2a8fbee86838d5476740e050d50 | grep -i 'statement' | grep -i 'performance'
    10.6$ 
    

  • 10.4 as of today (27 Sep 23) does not recognize the sys.statement_performance_analyzer call:

    10.4.32 47f0135d7aba57d160d1d95ff7f1f6b7bb07c2fb (Optimized)

    mysqltest: At line 6: query 'CALL sys.statement_performance_analyzer (1,1,1)' failed with wrong errno 1305: 'PROCEDURE sys.statement_performance_analyzer does not exist', instead of 1644...
    

    10.4.32 47f0135d7aba57d160d1d95ff7f1f6b7bb07c2fb (Debug)

    mysqltest: At line 6: query 'CALL sys.statement_performance_analyzer (1,1,1)' failed with wrong errno 1305: 'PROCEDURE sys.statement_performance_analyzer does not exist', instead of 1644...
    

  • We know the issue is present in a 10.6 debug build from 26 Sep 23.
Comment by Roel Van de Paar [ 2023-09-27 ]

Bar has a locally modified 10.6 build of 19 Sep 23 rev 59c4e13b789e7564f2b91ccbcab802585bb1387a which also crashes with the same stack. It seems the issue has been introduced between 11 and 19 Sep 23 but somehow indirectly.

Additionally, for an as yet unknown reason, sys.statement_performance_analyzer only became available between these dates.

Comment by Alexander Barkov [ 2023-09-27 ]

This script reproduces the same problem (the crash in debug builds) in all versions starting at least from 10.4:

--source include/have_binlog_format_row.inc
--source include/master-slave.inc
 
CREATE TABLE t1 (a INT);
 
SELECT @@SLAVE_TYPE_CONVERSIONS;
 
--sync_slave_with_master
ALTER TABLE t1 MODIFY a VARCHAR(1) CHARACTER SET utf32;
 
--connection master
INSERT INTO t1 VALUES (1);
 
--connection slave
--wait_for_slave_to_stop
SELECT * FROM t1 ORDER BY a;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
 
--connection master
DROP TABLE t1;
 
--source include/rpl_end.inc

The stack trace is:

mysys/stacktrace.c:215(my_print_stacktrace)[0x14e88e5]
sql/signal_handler.cc:241(handle_fatal_signal)[0xc49cf5]
sigaction.c:0(__restore_rt)[0x7fc58784eb20]
linux/raise.c:51(__GI_raise)[0x7fc58734b625]
stdlib/abort.c:81(__GI_abort)[0x7fc5873348d9]
intl/loadmsgcat.c:509(get_sysdep_segment_value)[0x7fc5873347a9]
assert.c:0(.annobin_assert.c_end)[0x7fc587343a66]
strings/ctype-ucs2.c:2339(my_vsnprintf_utf32)[0x1542a55]
strings/ctype-ucs2.c:2428(my_snprintf_utf32)[0x1542e6a]
sql/field.cc:8118(Field_varstring::sql_rpl_type(String*) const)[0xc28ba5]
sql/rpl_utility_server.cc:1000(table_def::compatible_with(THD*, rpl_group_info*, TABLE*, TABLE**) const)[0xa58402]
sql/log_event_server.cc:5602(Rows_log_event::do_apply_event(rpl_group_info*))[0xdc83d6]
sql/log_event.h:1499(Log_event::apply_event(rpl_group_info*))[0x7c33d7]
sql/slave.cc:3894(apply_event_and_update_pos_apply(Log_event*, THD*, rpl_group_info*, int))[0x7b55de]
sql/slave.cc:4066(apply_event_and_update_pos(Log_event*, THD*, rpl_group_info*))[0x7b5c23]
sql/slave.cc:4427(exec_relay_log_event(THD*, Relay_log_info*, rpl_group_info*))[0x7b675e]
sql/slave.cc:5630(handle_slave_sql)[0x7ba04f]
perfschema/pfs.cc:2203(pfs_spawn_thread)[0xf7f8cb]
nptl/pthread_create.c:480(start_thread)[0x7fc5878434e2]
x86_64/clone.S:97(__GI___clone)[0x7fc5874106c3]

Comment by Alexander Barkov [ 2023-09-27 ]

Elkin, can you please review:
https://github.com/MariaDB/server/commit/5b4e534f05ee4830f3a4166d8239e75fc258104d

Thanks.

Comment by Andrei Elkin [ 2023-10-10 ]

Looks good. Thanks Alexander for spending your time and productively in replication !

Generated at Thu Feb 08 10:29:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.