[MDEV-23534] SIGSEGV in sf_malloc_usable_size/my_free on SET GLOBAL REPLICATE_DO_TABLE Created: 2020-08-22  Updated: 2021-04-15  Resolved: 2020-09-02

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.1, 10.2, 10.3, 10.4
Fix Version/s: 10.1.48, 10.2.35, 10.3.26, 10.4.16

Type: Bug
Priority: Major
Reporter: Roel Van de Paar
Assignee: Sujatha Sivakumar (Inactive)
Resolution: Fixed
Votes: 0
Labels: affects-tests, not-10.5

Issue Links:
Relates
relates to MDEV-22059 MSAN report at replicate_ignore_table... Closed
relates to MDEV-23657 SIGSEGV in malloc_size_and_flag from ... Closed

 Description   

USE test;
SET SESSION default_master_connection='a';
CREATE TABLE t(a INT) UNION=(t);
CHANGE MASTER TO MASTER_USER='a', MASTER_PASSWORD='a';
SET GLOBAL REPLICATE_DO_TABLE=NULL;

Leads to:

10.4.15 eae968f62d285de97ed607c87bc131cd863d5d03 (Debug)

Core was generated by `/test/MD110820-mariadb-10.4.15-linux-x86_64-dbg/bin/mysqld --no-defaults --core'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=11)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
[Current thread is 1 (Thread 0x15366c344700 (LWP 1432627))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
#1  0x0000564c6424e8a6 in my_write_core (sig=sig@entry=11) at /test/10.4_dbg/mysys/stacktrace.c:482
#2  0x0000564c639cacdc in handle_fatal_signal (sig=11) at /test/10.4_dbg/sql/signal_handler.cc:343
#3  <signal handler called>
#4  sf_malloc_usable_size (ptr=ptr@entry=0x38, is_thread_specific=is_thread_specific@entry=0x15366c340fff "") at /test/10.4_dbg/mysys/safemalloc.c:215
#5  0x0000564c64249b76 in my_free (ptr=0x38) at /test/10.4_dbg/mysys/my_malloc.c:213
#6  0x0000564c642264ab in delete_dynamic (array=array@entry=0x15364480ef28) at /test/10.4_dbg/mysys/array.c:302
#7  0x0000564c6422a6e7 in my_hash_free (hash=hash@entry=0x15364480ef00) at /test/10.4_dbg/mysys/hash.c:158
#8  0x0000564c6362f152 in Rpl_filter::set_do_table (this=this@entry=0x15364480ef00, table_spec=table_spec@entry=0x0) at /test/10.4_dbg/sql/rpl_filter.cc:358
#9  0x0000564c6386ee4f in Sys_var_rpl_filter::set_filter_value (this=this@entry=0x564c64ea7320 <Sys_replicate_do_table>, value=0x0, mi=mi@entry=0x153644900000) at /test/10.4_dbg/sql/sys_vars.cc:5028
#10 0x0000564c6386efd9 in Sys_var_rpl_filter::global_update (this=0x564c64ea7320 <Sys_replicate_do_table>, thd=<optimized out>, var=0x15364486d1e8) at /test/10.4_dbg/sql/sys_vars.cc:5007
#11 0x0000564c63633a3a in sys_var::update (this=0x564c64ea7320 <Sys_replicate_do_table>, thd=0x153644815070, var=0x15364486d1e8) at /test/10.4_dbg/sql/set_var.cc:208
#12 0x0000564c63633f75 in set_var::update (this=<optimized out>, thd=<optimized out>) at /test/10.4_dbg/sql/set_var.cc:837
#13 0x0000564c636352c2 in sql_set_variables (thd=thd@entry=0x153644815070, var_list=var_list@entry=0x153644819ea8, free=free@entry=true) at /test/10.4_dbg/sql/set_var.cc:740
#14 0x0000564c6371bde0 in mysql_execute_command (thd=thd@entry=0x153644815070) at /test/10.4_dbg/sql/sql_parse.cc:4942
#15 0x0000564c63722090 in mysql_parse (thd=thd@entry=0x153644815070, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x15366c343460, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.4_dbg/sql/sql_parse.cc:7896
#16 0x0000564c63724920 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x153644815070, packet=packet@entry=0x153644857071 "SET GLOBAL replicate_do_TABLE=NULL", packet_length=packet_length@entry=34, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /test/10.4_dbg/sql/sql_parse.cc:1834
#17 0x0000564c6372835b in do_command (thd=0x153644815070) at /test/10.4_dbg/sql/sql_parse.cc:1352
#18 0x0000564c638548b6 in do_handle_one_connection (connect=connect@entry=0x153669035790) at /test/10.4_dbg/sql/sql_connect.cc:1412
#19 0x0000564c638549d6 in handle_one_connection (arg=0x153669035790) at /test/10.4_dbg/sql/sql_connect.cc:1316
#20 0x000015366b5426db in start_thread (arg=0x15366c344700) at pthread_create.c:463
#21 0x000015366a6bca3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.2.34 (dbg), 10.3.25 (dbg), 10.4.15 (dbg)

Bug confirmed not present in:
MariaDB: 10.1.47 (dbg), 10.1.47 (opt), 10.2.34 (opt), 10.3.25 (opt), 10.4.15 (opt), 10.5.6 (dbg), 10.5.6 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.47 (dbg), 5.6.47 (opt), 5.7.29 (dbg), 5.7.29 (opt), 8.0.19 (dbg), 8.0.19 (opt)



 Comments   
Comment by Roel Van de Paar [ 2020-08-22 ]

The stack, and the filtering applied to it (unique bug ID SIGSEGV|sf_malloc_usable_size|my_free|delete_dynamic|my_hash_free), may mask other bugs. A fix would be appreciated to avoid that.
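
For readers unfamiliar with this kind of stack-based filtering, here is a minimal sketch of how such an ID could be derived from the backtrace in the description. It assumes the ID is the signal name followed by the first four function names below `<signal handler called>`, which is what the ID quoted above matches; the actual test-framework tooling may work differently, and `unique_bug_id` and `FRAME_RE` are hypothetical names used only for illustration.

```python
import re

# Matches a gdb frame line such as
#   "#5  0x0000564c64249b76 in my_free (ptr=0x38) at ..."
# or, for the faulting frame, which has no address:
#   "#4  sf_malloc_usable_size (ptr=...) at ..."
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?([A-Za-z_][\w:]*) \(")


def unique_bug_id(backtrace, signal_name="SIGSEGV", depth=4):
    """Build 'SIGNAL|f1|f2|...' from the frames below the signal handler.

    Assumption: the ID uses the first `depth` frames after the
    '<signal handler called>' marker. Hypothetical helper, not the real tool.
    """
    frames = []
    seen_handler = False
    for line in backtrace.splitlines():
        line = line.strip()
        if "<signal handler called>" in line:
            seen_handler = True
            continue
        if not seen_handler:
            continue  # skip the signal-handling frames above the marker
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(1))
        if len(frames) == depth:
            break
    return "|".join([signal_name] + frames)


# Frames #2 to #8 copied from the backtrace in the description.
BACKTRACE = """\
#2  0x0000564c639cacdc in handle_fatal_signal (sig=11) at /test/10.4_dbg/sql/signal_handler.cc:343
#3  <signal handler called>
#4  sf_malloc_usable_size (ptr=ptr@entry=0x38, is_thread_specific=is_thread_specific@entry=0x15366c340fff "") at /test/10.4_dbg/mysys/safemalloc.c:215
#5  0x0000564c64249b76 in my_free (ptr=0x38) at /test/10.4_dbg/mysys/my_malloc.c:213
#6  0x0000564c642264ab in delete_dynamic (array=array@entry=0x15364480ef28) at /test/10.4_dbg/mysys/array.c:302
#7  0x0000564c6422a6e7 in my_hash_free (hash=hash@entry=0x15364480ef00) at /test/10.4_dbg/mysys/hash.c:158
#8  0x0000564c6362f152 in Rpl_filter::set_do_table (this=this@entry=0x15364480ef00, table_spec=table_spec@entry=0x0) at /test/10.4_dbg/sql/rpl_filter.cc:358
"""

print(unique_bug_id(BACKTRACE))
```

Because only the innermost frames enter the ID, any other crash that ends in the same `my_hash_free` / `delete_dynamic` / `my_free` sequence would be grouped under it regardless of the caller, which is the masking concern raised here.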

Comment by Roel Van de Paar [ 2020-08-22 ]

During cleanup of my runs, quite a few trials with the same unique ID were regrettably deleted. There may be other test cases leading to the same crash, which would confirm my last comment. I may do a separate run later with this particular filter removed; a fix would be great to avoid masking.

Comment by Roel Van de Paar [ 2020-08-25 ]

This must be a recent regression: it does not crash a 10.4.14 ASAN build (nor give any ASAN warnings), but it crashes on 10.4.15.

Not sure after all. It does not seem to crash on ASAN builds at all (even a new 10.4.15 build). Removing the 'regression' tag for the time being.

Comment by Sujatha Sivakumar (Inactive) [ 2020-08-25 ]

Confirmed the issue by reproducing it locally. This issue is a duplicate of MDEV-22059, which is fixed in 10.5. I cherry-picked the MDEV-22059 fix to 10.4 and observed that it addresses the issue. A test case still needs to be added.

Comment by Sujatha Sivakumar (Inactive) [ 2020-08-26 ]

To reproduce with MTR, execute the test with valgrind. The `replicate_do_table` filter reports:

==29406== Conditional jump or move depends on uninitialised value(s)
==29406==    at 0x162A2AF: delete_dynamic (array.c:301)
==29406==    by 0x1630CF8: my_hash_free (hash.c:158)
==29406==    by 0x82301C: Rpl_filter::set_do_table(char const*) (rpl_filter.cc:358)
==29406==    by 0xAEF693: Sys_var_rpl_filter::set_filter_value(char const*, Master_info*) (sys_vars.cc:5028)
==29406==    by 0xAEF5B2: Sys_var_rpl_filter::global_update(THD*, set_var*) (sys_vars.cc:5007)
==29406==    by 0x828468: sys_var::update(THD*, set_var*) (set_var.cc:208)
==29406==    by 0x82A1DB: set_var::update(THD*) (set_var.cc:837)
==29406==    by 0x829E6F: sql_set_variables(THD*, List<set_var_base>*, bool) (set_var.cc:740)
==29406==    by 0x94A026: mysql_execute_command(THD*) (sql_parse.cc:4942)
==29406==    by 0x953DF2: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:7896)
==29406==    by 0x9401E0: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1834)
==29406==    by 0x93E981: do_command(THD*) (sql_parse.cc:1352)
==29406==    by 0xACEF0D: do_handle_one_connection(CONNECT*) (sql_connect.cc:1412)
==29406==    by 0xACEC5C: handle_one_connection (sql_connect.cc:1316)
==29406==    by 0x15252D7: pfs_spawn_thread (pfs.cc:1869)
==29406==    by 0x5B436DA: start_thread (pthread_create.c:463)
^ Found warnings in /home/sujatha/bug_repo/test-10.4/bld/mysql-test/var/log/mysqld.1.err

The `replicate_wild_ignore_table` filter also reports the following valgrind error (this is fixed in 10.5 as part of MDEV-22317):

==8717==    by 0x15252D7: pfs_spawn_thread (pfs.cc:1869)
==8717==    by 0x5B436DA: start_thread (pthread_create.c:463)
==8717==    by 0x6A0AA3E: clone (clone.S:95)
==8717== Conditional jump or move depends on uninitialised value(s)
==8717==    at 0x162A2AF: delete_dynamic (array.c:301)
==8717==    by 0x8232E3: Rpl_filter::set_wild_do_table(char const*) (rpl_filter.cc:420)
==8717==    by 0xAEF6EA: Sys_var_rpl_filter::set_filter_value(char const*, Master_info*) (sys_vars.cc:5037)
==8717==    by 0xAEF5B2: Sys_var_rpl_filter::global_update(THD*, set_var*) (sys_vars.cc:5007)
==8717==    by 0x828468: sys_var::update(THD*, set_var*) (set_var.cc:208)
==8717==    by 0x82A1DB: set_var::update(THD*) (set_var.cc:837)
==8717==    by 0x829E6F: sql_set_variables(THD*, List<set_var_base>*, bool) (set_var.cc:740)
==8717==    by 0x94A026: mysql_execute_command(THD*) (sql_parse.cc:4942)
==8717==    by 0x953DF2: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:7896)
==8717==    by 0x9401E0: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1834)
==8717==    by 0x93E981: do_command(THD*) (sql_parse.cc:1352)
==8717==    by 0xACEF0D: do_handle_one_connection(CONNECT*) (sql_connect.cc:1412)
==8717==    by 0xACEC5C: handle_one_connection (sql_connect.cc:1316)
==8717==    by 0x15252D7: pfs_spawn_thread (pfs.cc:1869)
==8717==    by 0x5B436DA: start_thread (pthread_create.c:463)
==8717==    by 0x6A0AA3E: clone (clone.S:95)

Hence both fixes, MDEV-22059 and MDEV-22317, are being backported to the lower versions.
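
As a quick cross-check of the two valgrind reports above, the following sketch extracts the faulting frame and the first `Rpl_filter` caller from each. It only parses the lines quoted in this comment; `fault_site_and_setter` and `VG_FRAME_RE` are hypothetical names, and the regular expression assumes the frame layout shown in these particular reports.

```python
import re

# Matches valgrind frame lines such as
#   "==29406==    at 0x162A2AF: delete_dynamic (array.c:301)"
#   "==29406==    by 0x82301C: Rpl_filter::set_do_table(char const*) (rpl_filter.cc:358)"
VG_FRAME_RE = re.compile(
    r"^==\d+==\s+(at|by) 0x[0-9A-Fa-f]+: (.+) \(([^()]+:\d+)\)$"
)


def fault_site_and_setter(report):
    """Return (faulting frame, first Rpl_filter caller) for one report."""
    fault = setter = None
    for line in report.splitlines():
        match = VG_FRAME_RE.match(line.strip())
        if not match:
            continue
        kind, func, location = match.groups()
        name = func.split("(")[0]  # drop the C++ argument list
        if kind == "at" and fault is None:
            fault = f"{name} ({location})"
        elif fault is not None and setter is None and name.startswith("Rpl_filter::"):
            setter = name
    return fault, setter


# First lines of each report, copied from the valgrind output above.
REPORT_DO_TABLE = """\
==29406== Conditional jump or move depends on uninitialised value(s)
==29406==    at 0x162A2AF: delete_dynamic (array.c:301)
==29406==    by 0x1630CF8: my_hash_free (hash.c:158)
==29406==    by 0x82301C: Rpl_filter::set_do_table(char const*) (rpl_filter.cc:358)
"""

REPORT_WILD = """\
==8717== Conditional jump or move depends on uninitialised value(s)
==8717==    at 0x162A2AF: delete_dynamic (array.c:301)
==8717==    by 0x8232E3: Rpl_filter::set_wild_do_table(char const*) (rpl_filter.cc:420)
"""

for report in (REPORT_DO_TABLE, REPORT_WILD):
    print(fault_site_and_setter(report))
```

Both reports fault at `delete_dynamic (array.c:301)` and differ only in the `Rpl_filter` setter that reached it, which is consistent with treating them as two entry points to the same class of problem.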

Comment by Sujatha Sivakumar (Inactive) [ 2020-08-27 ]

Hello Andrei,

Could you please review the fix for MDEV-23534?

To verify the issue, the tests need to be executed with the '--valgrind' option.

It is a backport of the following bug fixes from 10.5:

MDEV-22317: SIGSEGV in my_free/delete_dynamic in optimized builds (ARIA)
MDEV-22059: MSAN report at replicate_ignore_table_grant

Patch: https://github.com/MariaDB/server/commit/2f859962b032cc75176e05df8d704eec413cdb17

BuildBot Testing: http://buildbot.askmonty.org/buildbot/grid?category=main&branch=bb-10.1-sujatha

Thank you.

Comment by Andrei Elkin [ 2020-08-28 ]

Could you please refer to the original patches by their git commit IDs? Otherwise the commit is good.
Thanks!

Comment by Sujatha Sivakumar (Inactive) [ 2020-09-02 ]

Tested the backported changes on 10.1, 10.2, 10.3 and 10.4.

Please null-merge to 10.5.

No merge conflicts were observed. All patches were tested on buildbot and the results were fine.

10.2 changes: https://github.com/MariaDB/server/commit/1585382cdbec831487b0679cc12a57bfd8d71f80
10.3 changes: https://github.com/MariaDB/server/commit/b252ca1b124b8b5b95e83d0e05f3fecca6f0b2be
10.4 changes: https://github.com/MariaDB/server/commit/214556f6267534e4170aebf2c4561da7274eb168
