[MDEV-29365] [DRAFT] binlog_encryption.encrypted_master sometimes failing in bb with timeout Created: 2022-08-23  Updated: 2023-11-28

Status: Open
Project: MariaDB Server
Component/s: Replication, Tests
Affects Version/s: 10.5, 10.8, 10.9, 10.10
Fix Version/s: 10.5, 10.6

Type: Bug Priority: Major
Reporter: Angelique Sklavounos (Inactive) Assignee: Angelique Sklavounos (Inactive)
Resolution: Unresolved Votes: 0
Labels: None


 Description   

https://buildbot.mariadb.org/#/builders/168/builds/13963

10.5 55c648a73

binlog_encryption.encrypted_master 'cbc,innodb,mix' w3 [ fail ]  timeout after 480 seconds
        Test ended at 2022-08-23 06:05:46
 
Test case timeout after 480 seconds
 
== /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/log/encrypted_master.log == 
DROP DATABASE database_name_to_encrypt;
DROP USER new_user_name_to_encrypt;
DROP SERVER server_name_to_encrypt;
#############################
# Final checks for the master
#############################
NOT FOUND /_to_encrypt.*/ in master-bin.0*
NOT FOUND /COMMIT.*/ in master-bin.0*
NOT FOUND /TIMESTAMP.*/ in master-bin.0*
include/save_master_pos.inc
#############################
# Final checks for the slave
#############################
connection server_2;
include/sync_io_with_master.inc
FOUND 1 /_to_encrypt.*/ in slave-relay-bin.0*
FOUND 1 /COMMIT.*/ in slave-relay-bin.0*
FOUND 1 /TIMESTAMP.*/ in slave-relay-bin.0*
include/start_slave.inc
include/sync_slave_sql_with_io.inc
 
 == /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default': 2013 Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 110
 
 == /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/tmp/analyze-timeout-mysqld.2.err ==
mysqltest: Could not open connection 'default': 2013 Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 110
 - found 'core' (0/1)
Core generated by '/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
BFD: warning: /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/log/binlog_encryption.encrypted_master-cbc,innodb,mix/mysqld.2/data/core is truncated: expected core file size >= 691384320, found: 94531584
[New LWP 19212]
[New LWP 19224]
[New LWP 19228]
[New LWP 9380]
[New LWP 19354]
[New LWP 19258]
[New LWP 19245]
[New LWP 19241]
[New LWP 19297]
[New LWP 32504]
[New LWP 19380]
[New LWP 9387]
[New LWP 19261]
[New LWP 10055]
[New LWP 9304]
[New LWP 10044]
[New LWP 10050]
[New LWP 9367]
[New LWP 19263]
[New LWP 7778]
Cannot access memory at address 0x7f77547a2148
Cannot access memory at address 0x7f77547a2140
Failed to read a valid object file image from memory.
Core was generated by `/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd --defaults-group-su'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f77532763b1 in ?? ()
[Current thread is 1 (LWP 19212)]
#0  0x00007f77532763b1 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7ffd22e4e220
Compressed file /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/log/binlog_encryption.encrypted_master-cbc,innodb,mix/mysqld.2/data/core
 - found 'core' (1/1)
Core generated by '/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
BFD: warning: /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/log/binlog_encryption.encrypted_master-cbc,innodb,mix/mysqld.1/data/core is truncated: expected core file size >= 1064722432, found: 503291904
[New LWP 19028]
[New LWP 19049]
[New LWP 2080]
[New LWP 19036]
[New LWP 19190]
[New LWP 2716]
[New LWP 19109]
[New LWP 19053]
[New LWP 19300]
[New LWP 19176]
[New LWP 19372]
[New LWP 19365]
[New LWP 19498]
[New LWP 19382]
[New LWP 2718]
[New LWP 2081]
[New LWP 1781]
[New LWP 19062]
[New LWP 19172]
Cannot access memory at address 0x7f203a496148
Cannot access memory at address 0x7f203a496140
Failed to read a valid object file image from memory.
Core was generated by `/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd --defaults-group-su'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f2038f6a3b1 in ?? ()
[Current thread is 1 (LWP 19028)]
#0  0x00007f2038f6a3b1 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7ffd0e03b4e0

The primary error log shows:
https://ci.mariadb.org/27666/logs/amd64-ubuntu-1804-clang10-asan/mysqld.1.err.3

220823  6:01:05 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.5.18-MariaDB-debug-log
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=5
max_threads=153
thread_count=6
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63754 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
2022-08-23  6:01:05 5 [Warning] Aborted connection 5 to db: 'test' user: 'root' host: 'localhost' (Got an error reading communication packets)
stack_bottom = 0x0 thread_stack 0x100000
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(__interceptor_backtrace+0x5b)[0x76b47b]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(my_print_stacktrace+0x10e)[0x287717e]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(handle_fatal_signal+0x67e)[0x129466e]
??:0(__restore_rt)[0x7f2038f6d980]
??:0(__poll)[0x7f20382d1cb9]
sanitizer_common/sanitizer_common_interceptors.inc:3909(poll)[0x768c78]
sql/mysqld.cc:6253(handle_connections_sockets())[0x7f1b64]
psi/mysql_thread.h:740(inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int))[0x7e7dcb]
??:0(__libc_start_main)[0x7f20381debf7]
??:0(_start)[0x73445a]

The replica error log shows:
https://ci.mariadb.org/27666/logs/amd64-ubuntu-1804-clang10-asan/mysqld.2.err.3

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x100000
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(__interceptor_backtrace+0x5b)[0x76b47b]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(my_print_stacktrace+0x10e)[0x287717e]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(handle_fatal_signal+0x67e)[0x129466e]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12980)[0x7f7753279980]
??:0(__poll)[0x7f77525ddcb9]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(__interceptor_poll+0x78)[0x768c78]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(_Z26handle_connections_socketsv+0x614)[0x7f1b64]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(_Z11mysqld_mainiPPc+0x1fab)[0x7e7dcb]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f77524eabf7]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(_start+0x2a)[0x73445a]

The test log is a bit confusing because it later reports:

binlog_encryption.encrypted_master 'cbc,innodb,mix' w3 [ retry-pass ]  247066
Retrying test binlog_encryption.encrypted_master, attempt(3/3)...

and then the test fails with:

binlog_encryption.encrypted_master 'cbc,innodb,mix' w3 [ retry-fail ]  timeout after 480 seconds
        Test ended at 2022-08-23 06:31:27
Test case timeout after 480 seconds
== /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/log/encrypted_master.log == 
CREATE DATABASE database_name_to_encrypt;
USE database_name_to_encrypt;
CREATE USER user_name_to_encrypt;
GRANT ALL ON database_name_to_encrypt.* TO user_name_to_encrypt;
SET PASSWORD FOR user_name_to_encrypt = PASSWORD('password_to_encrypt');
CREATE TABLE innodb_table_name_to_encrypt (
int_column_name_to_encrypt INT PRIMARY KEY,
timestamp_column_name_to_encrypt TIMESTAMP(6) NULL,
blob_column_name_to_encrypt BLOB,
virt_column_name_to_encrypt INT AS (int_column_name_to_encrypt % 10) VIRTUAL,
pers_column_name_to_encrypt INT AS (int_column_name_to_encrypt) PERSISTENT,
INDEX `index_name_to_encrypt`(`timestamp_column_name_to_encrypt`)
) ENGINE=InnoDB 
PARTITION BY RANGE (int_column_name_to_encrypt)
SUBPARTITION BY KEY (int_column_name_to_encrypt)
SUBPARTITIONS 2 (
PARTITION partition0_name_to_encrypt VALUES LESS THAN (100),
PARTITION partition1_name_to_encrypt VALUES LESS THAN (MAXVALUE)
)
;
 == /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/tmp/analyze-timeout-mysqld.2.err ==
mysqltest: Could not open connection 'default': 2013 Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 110
 == /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default': 2013 Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 110
 - found 'core' (1/1)
Core generated by '/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
BFD: warning: /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/log/binlog_encryption.encrypted_master-cbc,innodb,mix/mysqld.2/data/core is truncated: expected core file size >= 991543296, found: 69337088
[New LWP 22746]
[New LWP 22842]
[New LWP 22920]
[New LWP 22807]
[New LWP 22928]
[New LWP 22788]
[New LWP 22835]
[New LWP 22924]
[New LWP 22993]
[New LWP 6885]
[New LWP 11270]
[New LWP 12366]
Cannot access memory at address 0x7fedd6280148
Cannot access memory at address 0x7fedd6280140
Failed to read a valid object file image from memory.
Core was generated by `/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd --defaults-group-su'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fedd4d543b1 in ?? ()
[Current thread is 1 (LWP 22746)]
#0  0x00007fedd4d543b1 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7ffd015418a0
 - deleting it, already saved 1
 - found 'core' (1/1)
Core generated by '/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
BFD: warning: /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/3/log/binlog_encryption.encrypted_master-cbc,innodb,mix/mysqld.1/data/core is truncated: expected core file size >= 1024249856, found: 77586432
[New LWP 22464]
[New LWP 16674]
[New LWP 22538]
[New LWP 22567]
[New LWP 22583]
[New LWP 23122]
[New LWP 10881]
[New LWP 12631]
[New LWP 22507]
[New LWP 16659]
[New LWP 16666]
[New LWP 16628]
[New LWP 13404]
[New LWP 3105]
[New LWP 16636]
[New LWP 10812]
[New LWP 22541]
[New LWP 22515]
[New LWP 22566]
Cannot access memory at address 0x7f0b1eb2a148
Cannot access memory at address 0x7f0b1eb2a140
Failed to read a valid object file image from memory.
Core was generated by `/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd --defaults-group-su'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f0b1d5fe3b1 in ?? ()
[Current thread is 1 (LWP 22464)]
#0  0x00007f0b1d5fe3b1 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7ffed50f4ee0

The primary error log for this fail shows:

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
2022-08-23  6:27:56 16 [Warning] Aborted connection 16 to db: 'test' user: 'root' host: 'localhost' (Got an error reading communication packets)
2022-08-23  6:27:56 18 [Warning] Aborted connection 18 to db: 'test' user: 'root' host: 'localhost' (Got an error reading communication packets)
stack_bottom = 0x0 thread_stack 0x100000
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(__interceptor_backtrace+0x5b)[0x76b47b]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(my_print_stacktrace+0x10e)[0x287717e]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(handle_fatal_signal+0x67e)[0x129466e]
??:0(__restore_rt)[0x7f0b1d601980]
??:0(__poll)[0x7f0b1c965cb9]
mysys/my_addr_resolve.c:299(my_addr_resolve)[0x768c78]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(_Z26handle_connections_socketsv+0x614)[0x7f1b64]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(_Z11mysqld_mainiPPc+0x1fab)[0x7e7dcb]
??:0(__libc_start_main)[0x7f0b1c872bf7]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(_start+0x2a)[0x73445a]

The replica error log shows:

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x100000
mysys/my_addr_resolve.c:299(my_addr_resolve)[0x76b47b]
/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd(my_print_stacktrace+0x10e)[0x287717e]
include/buf0buf.h:2178(buf_pool_t::release_freed_page(buf_page_t*))[0x129466e]
??:0(__restore_rt)[0x7fedd4d57980]
??:0(__poll)[0x7fedd40bbcb9]
sanitizer_common/sanitizer_common_interceptors.inc:3909(poll)[0x768c78]
sql/mysqld.cc:6253(handle_connections_sockets())[0x7f1b64]
psi/mysql_thread.h:740(inline_mysql_mutex_lock(st_mysql_mutex*, char const*, unsigned int))[0x7e7dcb]
??:0(__libc_start_main)[0x7fedd3fc8bf7]
mysys/my_addr_resolve.c:299(my_addr_resolve)[0x73445a]



 Comments   
Comment by Angelique Sklavounos (Inactive) [ 2022-09-06 ]

Same seen here: https://buildbot.mariadb.org/#/builders/168/builds/14284

10.6 19de8b701

rpl.rpl_row_drop_create_temp_table 'innodb,row' w12 [ fail ]  timeout after 480 seconds
        Test ended at 2022-09-06 06:07:25
 
Test case timeout after 480 seconds
 
== /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/12/log/rpl_row_drop_create_temp_table.log == 
INSERT INTO tt_error_1() VALUES (25), (25);
ERROR 23000: Duplicate entry '25' for key 'PRIMARY'
INSERT INTO nt_error_1() VALUES (15), (15);
ERROR 23000: Duplicate entry '15' for key 'PRIMARY'
COMMIT;
-b-b-b-b-b-b-b-b-b-b-b- >> B T N-Temp Te Ne C << -b-b-b-b-b-b-b-b-b-b-b-
include/show_binlog_events.inc
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
master-bin.000001	#	Gtid	#	#	BEGIN GTID #-#-#
master-bin.000001	#	Annotate_rows	#	#	INSERT INTO nt_error_1() VALUES (15), (15)
master-bin.000001	#	Table_map	#	#	table_id: # (test.nt_error_1)
master-bin.000001	#	Write_rows_v1	#	#	table_id: # flags: STMT_END_F
master-bin.000001	#	Query	#	#	COMMIT
master-bin.000001	#	Gtid	#	#	BEGIN GTID #-#-#
master-bin.000001	#	Annotate_rows	#	#	INSERT INTO tt_xx_1() VALUES (1)
master-bin.000001	#	Table_map	#	#	table_id: # (test.tt_xx_1)
master-bin.000001	#	Write_rows_v1	#	#	table_id: # flags: STMT_END_F
master-bin.000001	#	Xid	#	#	COMMIT /* XID */
-e-e-e-e-e-e-e-e-e-e-e- >> B T N-Temp Te Ne C << -e-e-e-e-e-e-e-e-e-e-e-
 
 
 == /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/12/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default': 2013 Lost connection to server at 'handshake: reading initial communication packet', system error: 110
 
 == /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/12/tmp/analyze-timeout-mysqld.2.err ==
mysqltest: Could not open connection 'default': 2013 Lost connection to server at 'handshake: reading initial communication packet', system error: 110
 - found 'core' (0/1)
Core generated by '/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
BFD: warning: /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/12/log/rpl.rpl_row_drop_create_temp_table-innodb,row/mysqld.2/data/core is truncated: expected core file size >= 720351232, found: 64794624
[New LWP 63371]
[New LWP 63448]
[New LWP 63483]
[New LWP 63549]
[New LWP 63461]
[New LWP 63492]
[New LWP 63546]
[New LWP 63566]
[New LWP 63557]
[New LWP 104838]
[New LWP 64430]
[New LWP 64435]
[New LWP 65912]
[New LWP 68173]
[New LWP 68174]
[New LWP 93055]
[New LWP 84020]
[New LWP 104815]
[New LWP 104836]
Cannot access memory at address 0x7f7a0ab3b148
Cannot access memory at address 0x7f7a0ab3b140
Failed to read a valid object file image from memory.
Core was generated by `/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd --defaults-group-su'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f7a0960f3b1 in ?? ()
[Current thread is 1 (LWP 63371)]
#0  0x00007f7a0960f3b1 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7ffd5f67daa0
Compressed file /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/12/log/rpl.rpl_row_drop_create_temp_table-innodb,row/mysqld.2/data/core
 - found 'core' (1/1)
Core generated by '/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd'
Output from gdb follows. The first stack trace is from the failing thread.
The following stack traces are from all threads (so the failing one is
duplicated).
--------------------------
BFD: warning: /buildbot/amd64-ubuntu-1804-clang10-asan/build/mysql-test/var/12/log/rpl.rpl_row_drop_create_temp_table-innodb,row/mysqld.1/data/core is truncated: expected core file size >= 777564160, found: 79859712
[New LWP 62051]
[New LWP 62169]
[New LWP 62105]
[New LWP 62136]
[New LWP 62132]
[New LWP 62183]
[New LWP 62119]
[New LWP 62365]
[New LWP 62174]
[New LWP 62402]
[New LWP 62357]
[New LWP 64567]
[New LWP 62412]
[New LWP 97296]
[New LWP 97933]
[New LWP 101449]
[New LWP 103796]
[New LWP 62349]
[New LWP 103911]
[New LWP 62249]
Cannot access memory at address 0x7fd776cb4148
Cannot access memory at address 0x7fd776cb4140
Failed to read a valid object file image from memory.
Core was generated by `/buildbot/amd64-ubuntu-1804-clang10-asan/build/sql/mariadbd --defaults-group-su'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fd7757883b1 in ?? ()
[Current thread is 1 (LWP 62051)]
#0  0x00007fd7757883b1 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffe8595860

Generated at Thu Feb 08 10:07:59 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.