[MDEV-33194] MTR tests fail with PS protocol on ASAN builds Created: 2024-01-07  Updated: 2024-01-08

Status: Confirmed
Project: MariaDB Server
Component/s: Prepared Statements, Tests
Affects Version/s: 10.4, 10.5, 10.6, 10.11, 11.0, 11.1, 11.2, 11.3, 11.4
Fix Version/s: 10.4, 10.5, 10.6, 10.11, 11.0, 11.1, 11.2, 11.3

Type: Bug Priority: Critical
Reporter: Elena Stepanova Assignee: Lena Startseva
Resolution: Unresolved Votes: 0
Labels: regression
Environment:

gcc (Debian 12.2.0-14) 12.2.0 /
cmake . -DPLUGIN_TOKUDB=NO -DPLUGIN_COLUMNSTORE=NO -DPLUGIN_XPAND=NO -DWITH_EMBEDDED_SERVER=YES -DCMAKE_BUILD_TYPE=Debug -DWITH_ASAN=YES -DMYSQL_MAINTAINER_MODE=OFF -DCMAKE_C_FLAGS=-fno-omit-frame-pointer -DCMAKE_CXX_FLAGS=-fno-omit-frame-pointer -DWITH_SAFEMALLOC=OFF



 Description   

A number of tests from the default set fail with ps-protocol on ASAN builds.
The failure started happening after this commit in 10.4:

commit 9e701518e7d0a541ba580b6a6afc9e9bcd522f23
Author: Lena Startseva
Date:   Fri Sep 15 17:48:03 2023 +0700
 
    MDEV-32177: Add to the ps-protocol a comparison of the result

On 10.4 it fails "no output" error:

10.4 613d0194979849fb5b3dd752f13b14672a2409e3

main.connect                             [ fail ]
        Test ended at 2024-01-07 20:53:25
 
CURRENT_TEST: main.connect
 
 
mysqltest failed but provided no output
The result from queries just before the failure was:
< snip >
NOT FOUND /This connection closed normally without authentication/ in mysqld.1.err
# let tcp to detect disconnect
select sleep(1);
sleep(1)
0
SHOW GLOBAL STATUS LIKE 'Aborted_connects%';
Variable_name	Value
Aborted_connects	1
Aborted_connects_preauth	1
FOUND 1 /This connection closed normally without authentication/ in mysqld.1.err
SET @save_connect_timeout= @@connect_timeout;
SET GLOBAL connect_timeout=2;
SHOW GLOBAL STATUS LIKE 'Aborted_connects%';
Variable_name	Value
Aborted_connects	2
Aborted_connects_preauth	2
FOUND 2 /This connection closed normally without authentication/ in mysqld.1.err
SET GLOBAL log_warnings=default;
SET GLOBAL connect_timeout= @save_connect_timeout;
# End of 10.4 tests

On 10.5+ they hang instead and eventually fail with a timeout.



 Comments   
Comment by Marko Mäkelä [ 2024-01-08 ]

I got some more detail on this:

ASAN_OPTIONS=log_path=/dev/shm/foo:abort_on_error=1 ./mtr --ps-protocol main.connect

For me, this produced the file /dev/shm/foo.23414 where I saw the following, while the test would appear to hang:

10.4 613d0194979849fb5b3dd752f13b14672a2409e3

==23414==ERROR: AddressSanitizer: heap-use-after-free on address 0x6200000004f8 at pc 0x56149e1fe256 bp 0x7fffbd173e70 sp 0x7fffbd173e68
READ of size 8 at 0x6200000004f8 thread T0
    #0 0x56149e1fe255 in dump_backtrace /mariadb/10.4/client/mysqltest.cc:9648
    #1 0x56149e1fe2d5 in signal_handler /mariadb/10.4/client/mysqltest.cc:9673
    #2 0x7f717405a50f  (/lib/x86_64-linux-gnu/libc.so.6+0x3c50f) (BuildId: 0f4e3614a0c0f27b92e61184fb4fd70ab52696fc)
    #3 0x7f71740a80fb in __pthread_kill_implementation nptl/pthread_kill.c:43
    #4 0x7f717405a471 in __GI_raise ../sysdeps/posix/raise.c:26
    #5 0x7f71740444b1 in __GI_abort stdlib/abort.c:79
    #6 0x7f7174af71ce in __sanitizer::Abort() ../../../../src/libsanitizer/sanitizer_common/sanitizer_posix_libcdep.cpp:143
    #7 0x7f7174b066c0 in __sanitizer::Die() ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:58
    #8 0x7f7174b0fa7b in __lsan::HandleLeaks() ../../../../src/libsanitizer/lsan/lsan_common_linux.cpp:120
    #9 0x7f7174b0df24 in __lsan::DoLeakCheck() ../../../../src/libsanitizer/lsan/lsan_common.cpp:803
    #10 0x7f717405c402 in __cxa_finalize stdlib/cxa_finalize.c:82
    #11 0x7f7174a25bf6  (/lib/x86_64-linux-gnu/libasan.so.8+0x25bf6) (BuildId: e863f0cb2bca0b741acc611cc6605b46ca8d71f5)

Apparently, this stack trace reporter can’t be disabled by a command line option. I disabled it with a patch:

diff --git a/client/mysqltest.cc b/client/mysqltest.cc
index b25f4621ec0..a67a7242171 100644
--- a/client/mysqltest.cc
+++ b/client/mysqltest.cc
@@ -9731,7 +9731,6 @@ static void init_signal_handling(void)
   sa.sa_handler= signal_handler;
 
   sigaction(SIGSEGV, &sa, NULL);
-  sigaction(SIGABRT, &sa, NULL);
 #ifdef SIGBUS
   sigaction(SIGBUS, &sa, NULL);
 #endif

With that patch and the invocation

ASAN_OPTIONS=log_path=/dev/shm/foo:abort_on_error=1:fast_unwind_on_malloc=0 ./mtr --ps-protocol main.connect

I got a detailed memory leak report with a full stack trace of the leaked allocation:

10.4 613d0194979849fb5b3dd752f13b14672a2409e3

=================================================================
==25003==ERROR: LeakSanitizer: detected memory leaks
 
Direct leak of 3128 byte(s) in 23 object(s) allocated from:
    #0 0x7f22eb4d7f97 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x55d251f88e39 in mysql_stmt_result_metadata /mariadb/10.4/libmariadb/libmariadb/mariadb_stmt.c:2271
    #2 0x55d251f54488 in run_query_stmt(st_connection*, st_command*, char*, unsigned long, st_dynamic_string*, st_dynamic_string*) /mariadb/10.4/client/mysqltest.cc:8584
    #3 0x55d251f55cd0 in run_query(st_connection*, st_command*, int) /mariadb/10.4/client/mysqltest.cc:9335
    #4 0x55d251f57503 in main /mariadb/10.4/client/mysqltest.cc:10169
    #5 0x7f22eaa456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #6 0x7f22eaa45784 in __libc_start_main_impl ../csu/libc-start.c:360
    #7 0x55d251f32160 in _start (/dev/shm/10.4a/client/mysqltest+0x431160) (BuildId: ac7fc9a933c14aae)
 
SUMMARY: AddressSanitizer: 3128 byte(s) leaked in 23 allocation(s).

Generated at Thu Feb 08 10:37:05 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.