[MDEV-15468] Server crashes in table_events_waits_common::make_row upon race condition while querying perfschema Created: 2018-03-05  Updated: 2023-06-08  Resolved: 2023-06-08

Status: Closed
Project: MariaDB Server
Component/s: Admin statements
Affects Version/s: 10.1, 10.2, 10.3
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Oleksandr Byelkin
Resolution: Cannot Reproduce Votes: 0
Labels: None


 Description   

The problem is reproducible very unreliably, sometimes it's fast, sometimes it takes hundreds of repetitions. Luckily, the test is short. Run with --repeat=1000.

Make sure that your plugin directory is populated. It might have something to do with TokuDB in general or MDEV-15174 in particular – I can't get the failure without ha_tokudb in the plugin directory.

The test case (and bug report) is for 10.3, because I couldn't get rid of INTERSECT in the scenario. Maybe when the reason is known, it will be possible to reproduce with the syntax applicable to previous versions.

--source include/have_perfschema.inc
 
SELECT COUNT(*) FROM INFORMATION_SCHEMA.ALL_PLUGINS;
UPDATE performance_schema.setup_instruments SET ENABLED = 'YES' WHERE NAME IN ('stage/sql/Commit_implicit','statement/sql/show_generic');
UPDATE performance_schema.setup_consumers SET enabled = 'YES' WHERE NAME IN ( 'events_waits_history' );
 
--connect (con1,localhost,root,,test)
--connect (con2,localhost,root,,test)
--send
( SELECT * FROM INFORMATION_SCHEMA.ALL_PLUGINS AS pl ORDER BY pl.PLUGIN_AUTH_VERSION ) INTERSECT ( SELECT * FROM INFORMATION_SCHEMA.ALL_PLUGINS AS pl ORDER BY pl.PLUGIN_AUTH_VERSION );
 
--connection con1
DROP TABLE IF EXISTS t1, t2;
CREATE PROCEDURE sp() SELECT * FROM mysql.time_zone HAVING Use_leap_seconds = 'Y' AND Use_leap_seconds = 'Y';
CALL sp;
DROP PROCEDURE sp;
SELECT * FROM performance_schema.events_waits_history ORDER BY TIMER_START;
 
--connection con2
--reap

10.3 d70573564cb2701076083ea7f55ae9539fad13b4

#3  <signal handler called>
#4  0x00005623230b78b1 in dirname_length (name=0x7f6f8737eae0 <error: Cannot access memory at address 0x7f6f8737eae0>) at /data/src/10.3/mysys/mf_dirname.c:33
#5  0x0000562322c0bba3 in table_events_waits_common::make_row (this=0x7f6f7404bf30, thread_own_wait=true, pfs_thread=0x7f6f8fd6a0c0, wait=0x7f6f8fd0d1a8) at /data/src/10.3/storage/perfschema/table_events_waits.cc:429
#6  0x0000562322c0c857 in table_events_waits_history::rnd_next (this=0x7f6f7404bf30) at /data/src/10.3/storage/perfschema/table_events_waits.cc:883
#7  0x0000562322bedf13 in ha_perfschema::rnd_next (this=0x7f6f7403c328, buf=0x7f6f74044bc8 "\340\063\t") at /data/src/10.3/storage/perfschema/ha_perfschema.cc:344
#8  0x0000562322921ac6 in handler::ha_rnd_next (this=0x7f6f7403c328, buf=0x7f6f74044bc8 "\340\063\t") at /data/src/10.3/sql/handler.cc:2759
#9  0x000056232291600e in find_all_keys (thd=0x7f6f74000b00, param=0x7f6f8c1fe560, select=0x7f6f74049bb8, fs_info=0x7f6f7404bd40, buffpek_pointers=0x7f6f8c1fe760, tempfile=0x7f6f8c1fe5f0, pq=0x0, found_rows=0x7f6f7404bf20) at /data/src/10.3/sql/filesort.cc:790
#10 0x0000562322914570 in filesort (thd=0x7f6f74000b00, table=0x7f6f7403b6e0, filesort=0x7f6f74049d88, tracker=0x7f6f7404a588, join=0x7f6f74018078, first_table_bit=1) at /data/src/10.3/sql/filesort.cc:278
#11 0x00005623226b0b18 in create_sort_index (thd=0x7f6f74000b00, join=0x7f6f74018078, tab=0x7f6f74049268, fsort=0x7f6f74049d88) at /data/src/10.3/sql/sql_select.cc:22359
#12 0x00005623226ab426 in st_join_table::sort_table (this=0x7f6f74049268) at /data/src/10.3/sql/sql_select.cc:20194
#13 0x00005623226ab03a in join_init_read_record (tab=0x7f6f74049268) at /data/src/10.3/sql/sql_select.cc:20135
#14 0x00005623226a8fca in sub_select (join=0x7f6f74018078, join_tab=0x7f6f74049268, end_of_records=false) at /data/src/10.3/sql/sql_select.cc:19229
#15 0x00005623226a8595 in do_select (join=0x7f6f74018078, procedure=0x0) at /data/src/10.3/sql/sql_select.cc:18772
#16 0x0000562322681646 in JOIN::exec_inner (this=0x7f6f74018078) at /data/src/10.3/sql/sql_select.cc:3987
#17 0x0000562322680ade in JOIN::exec (this=0x7f6f74018078) at /data/src/10.3/sql/sql_select.cc:3781
#18 0x0000562322681d1f in mysql_select (thd=0x7f6f74000b00, tables=0x7f6f74013c20, wild_num=1, fields=..., conds=0x0, og_num=1, order=0x7f6f74014398, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7f6f74018058, unit=0x7f6f74004938, select_lex=0x7f6f740050b0) at /data/src/10.3/sql/sql_select.cc:4186
#19 0x0000562322673ffd in handle_select (thd=0x7f6f74000b00, lex=0x7f6f74004870, result=0x7f6f74018058, setup_tables_done_option=0) at /data/src/10.3/sql/sql_select.cc:382
#20 0x000056232263f153 in execute_sqlcom_select (thd=0x7f6f74000b00, all_tables=0x7f6f74013c20) at /data/src/10.3/sql/sql_parse.cc:6552
#21 0x000056232263576d in mysql_execute_command (thd=0x7f6f74000b00) at /data/src/10.3/sql/sql_parse.cc:3763
#22 0x0000562322642b4c in mysql_parse (thd=0x7f6f74000b00, rawbuf=0x7f6f74013988 "SELECT * FROM performance_schema.events_waits_history ORDER BY TIMER_START", length=74, parser_state=0x7f6f8c2005d0, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:8009
#23 0x0000562322630200 in dispatch_command (command=COM_QUERY, thd=0x7f6f74000b00, packet=0x7f6f7400b251 "", packet_length=74, is_com_multi=false, is_next_command=false) at /data/src/10.3/sql/sql_parse.cc:1838
#24 0x000056232262ec3f in do_command (thd=0x7f6f74000b00) at /data/src/10.3/sql/sql_parse.cc:1383
#25 0x00005623227930fd in do_handle_one_connection (connect=0x56232553c9b0) at /data/src/10.3/sql/sql_connect.cc:1402
#26 0x0000562322792e8a in handle_one_connection (arg=0x56232553c9b0) at /data/src/10.3/sql/sql_connect.cc:1308
#27 0x0000562322c20f11 in pfs_spawn_thread (arg=0x562325550600) at /data/src/10.3/storage/perfschema/pfs.cc:1862
#28 0x00007f6f93579494 in start_thread (arg=0x7f6f8c201700) at pthread_create.c:333
#29 0x00007f6f9195f93f in clone () from /lib/x86_64-linux-gnu/libc.so.6



 Comments   
Comment by Rasmus Johansson (Inactive) [ 2018-04-24 ]

Assigning to Sanja because of INTERSECT

Comment by Rasmus Johansson (Inactive) [ 2018-04-24 ]

Lowering priority because of dependency on ha_tokudb

Comment by Oleksandr Byelkin [ 2018-04-24 ]

I can not repeat:
with/without --mem (speed of execution)
with/without including having_tokudb in the test body
--repeat 10000
I have no more ideas

Comment by Elena Stepanova [ 2018-04-24 ]

Try ASAN if it doesn't fail on a regular debug build. ASAN fails readily for me:

10.3 ASAN c39f8a80c9

==8331==ERROR: AddressSanitizer: SEGV on unknown address 0x7fe5857553e0 (pc 0x557b633628e0 sp 0x7fe589bd9af0 bp 0x7fe589bd9b10 T6)
    #0 0x557b633628df in dirname_length /data/src/10.3/mysys/mf_dirname.c:33
    #1 0x557b629f8854 in table_events_waits_common::make_row(bool, PFS_thread*, PFS_events_waits volatile*) /data/src/10.3/storage/perfschema/table_events_waits.cc:429
    #2 0x557b629fa4b8 in table_events_waits_history::rnd_next() /data/src/10.3/storage/perfschema/table_events_waits.cc:883
    #3 0x557b629b857c in ha_perfschema::rnd_next(unsigned char*) /data/src/10.3/storage/perfschema/ha_perfschema.cc:344
    #4 0x557b622c18b7 in handler::ha_rnd_next(unsigned char*) /data/src/10.3/sql/handler.cc:2762
    #5 0x557b622a3e6d in find_all_keys /data/src/10.3/sql/filesort.cc:790
    #6 0x557b622a056d in filesort(THD*, TABLE*, Filesort*, Filesort_tracker*, JOIN*, unsigned long long) /data/src/10.3/sql/filesort.cc:278
    #7 0x557b61cfc435 in create_sort_index(THD*, JOIN*, st_join_table*, Filesort*) /data/src/10.3/sql/sql_select.cc:22322
    #8 0x557b61ced148 in st_join_table::sort_table() /data/src/10.3/sql/sql_select.cc:20157
    #9 0x557b61cec681 in join_init_read_record(st_join_table*) /data/src/10.3/sql/sql_select.cc:20098
    #10 0x557b61ce5da4 in sub_select(JOIN*, st_join_table*, bool) /data/src/10.3/sql/sql_select.cc:19192
    #11 0x557b61ce4279 in do_select /data/src/10.3/sql/sql_select.cc:18735
    #12 0x557b61c81336 in JOIN::exec_inner() /data/src/10.3/sql/sql_select.cc:3953
    #13 0x557b61c7ef73 in JOIN::exec() /data/src/10.3/sql/sql_select.cc:3747
    #14 0x557b61c82473 in mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) /data/src/10.3/sql/sql_select.cc:4152
    #15 0x557b61c5d9dd in handle_select(THD*, LEX*, select_result*, unsigned long) /data/src/10.3/sql/sql_select.cc:382
    #16 0x557b61be07d4 in execute_sqlcom_select /data/src/10.3/sql/sql_parse.cc:6539
    #17 0x557b61bcec4a in mysql_execute_command(THD*) /data/src/10.3/sql/sql_parse.cc:3768
    #18 0x557b61be8d62 in mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) /data/src/10.3/sql/sql_parse.cc:8001
    #19 0x557b61bc3608 in dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) /data/src/10.3/sql/sql_parse.cc:1846
    #20 0x557b61bc069f in do_command(THD*) /data/src/10.3/sql/sql_parse.cc:1391
    #21 0x557b61f2905c in do_handle_one_connection(CONNECT*) /data/src/10.3/sql/sql_connect.cc:1402
    #22 0x557b61f28a71 in handle_one_connection /data/src/10.3/sql/sql_connect.cc:1308
    #23 0x557b62a2d5b9 in pfs_spawn_thread /data/src/10.3/storage/perfschema/pfs.cc:1862
    #24 0x7fe5964f6493 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7493)
    #25 0x7fe5948dc93e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xe893e)
 
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /data/src/10.3/mysys/mf_dirname.c:33 dirname_length
Thread T6 created by T0 here:
    #0 0x7fe59672fbba in pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x23bba)
    #1 0x557b62a2db81 in spawn_thread_v1 /data/src/10.3/storage/perfschema/pfs.cc:1912
    #2 0x557b619356ee in inline_mysql_thread_create /data/src/10.3/include/mysql/psi/mysql_thread.h:1268
    #3 0x557b6194b413 in create_thread_to_handle_connection(CONNECT*) /data/src/10.3/sql/mysqld.cc:6554
    #4 0x557b6194bb18 in create_new_thread /data/src/10.3/sql/mysqld.cc:6624
    #5 0x557b6194cb29 in handle_connections_sockets() /data/src/10.3/sql/mysqld.cc:6899
    #6 0x557b6194a8d0 in mysqld_main(int, char**) /data/src/10.3/sql/mysqld.cc:6176
    #7 0x557b61933a8f in main /data/src/10.3/sql/main.cc:25
    #8 0x7fe5948142b0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202b0)

Comment by Oleksandr Byelkin [ 2018-04-24 ]

Nothing with ASAN also

Comment by Elena Stepanova [ 2018-07-17 ]

Still reproducible for me after several attempts; but here is another test case, which fails reliably for me on debug and release builds (including, for example, the official 10.1.34 bintar):

SELECT * FROM INFORMATION_SCHEMA.ALL_PLUGINS;
SELECT * FROM performance_schema.`events_waits_history_long` UNION ALL SELECT * FROM performance_schema.`events_waits_history_long`;

10.1 e08ddccc35

#3  <signal handler called>
#4  0x0000563d9b791822 in dirname_length (name=0x7f77ac592c00 <error: Cannot access memory at address 0x7f77ac592c00>) at /data/src/10.1/mysys/mf_dirname.c:33
#5  0x0000563d9b3f512d in table_events_waits_common::make_row (this=0x7f77ad05b000, thread_own_wait=false, pfs_thread=0x7f77b574e780, wait=0x7f77b1d79ff8) at /data/src/10.1/storage/perfschema/table_events_waits.cc:429
#6  0x0000563d9b3f612f in table_events_waits_history_long::rnd_next (this=0x7f77ad05b000) at /data/src/10.1/storage/perfschema/table_events_waits.cc:961
#7  0x0000563d9b3d7b69 in ha_perfschema::rnd_next (this=0x7f77ad068488, buf=0x7f77ad0fa088 "\340\063\a") at /data/src/10.1/storage/perfschema/ha_perfschema.cc:359
#8  0x0000563d9b15adcd in handler::ha_rnd_next (this=0x7f77ad068488, buf=0x7f77ad0fa088 "\340\063\a") at /data/src/10.1/sql/handler.cc:2610
#9  0x0000563d9b29c783 in rr_sequential (info=0x7f77ad19a880) at /data/src/10.1/sql/records.cc:471
#10 0x0000563d9af8150f in sub_select (join=0x7f77ad048010, join_tab=0x7f77ad19a7c0, end_of_records=false) at /data/src/10.1/sql/sql_select.cc:18470
#11 0x0000563d9af80cbf in do_select (join=0x7f77ad048010, fields=0x7f77aecd93c0, table=0x0, procedure=0x0) at /data/src/10.1/sql/sql_select.cc:18106
#12 0x0000563d9af5b62a in JOIN::exec_inner (this=0x7f77ad048010) at /data/src/10.1/sql/sql_select.cc:3260
#13 0x0000563d9af588bb in JOIN::exec (this=0x7f77ad048010) at /data/src/10.1/sql/sql_select.cc:2547
#14 0x0000563d9aff16b7 in st_select_lex_unit::exec (this=0x7f77aecd8ba8) at /data/src/10.1/sql/sql_union.cc:847
#15 0x0000563d9afeebf0 in mysql_union (thd=0x7f77aecd5070, lex=0x7f77aecd8ae0, result=0x7f77ad047ec0, unit=0x7f77aecd8ba8, setup_tables_done_option=0) at /data/src/10.1/sql/sql_union.cc:39
#16 0x0000563d9af51619 in handle_select (thd=0x7f77aecd5070, lex=0x7f77aecd8ae0, result=0x7f77ad047ec0, setup_tables_done_option=0) at /data/src/10.1/sql/sql_select.cc:366
#17 0x0000563d9af210f1 in execute_sqlcom_select (thd=0x7f77aecd5070, all_tables=0x7f77ad0433e0) at /data/src/10.1/sql/sql_parse.cc:5944
#18 0x0000563d9af17584 in mysql_execute_command (thd=0x7f77aecd5070) at /data/src/10.1/sql/sql_parse.cc:2990
#19 0x0000563d9af24cc0 in mysql_parse (thd=0x7f77aecd5070, rawbuf=0x7f77ad043088 "SELECT * FROM performance_schema.`events_waits_history_long` UNION ALL SELECT * FROM performance_schema.`events_waits_history_long`", length=131, parser_state=0x7f77b7fb95e0) at /data/src/10.1/sql/sql_parse.cc:7460
#20 0x0000563d9af13655 in dispatch_command (command=COM_QUERY, thd=0x7f77aecd5070, packet=0x7f77b17f9071 "aria/PAGECACHE::cache_lock\023ma_pagecache.c:3036\f141462894852\f141463119476\006\062\062\064\066\062\064\373\373\373\373\373\001\060\002\067\067\005STAGE\004lock\373\373\204", packet_length=131) at /data/src/10.1/sql/sql_parse.cc:1492
#21 0x0000563d9af123da in do_command (thd=0x7f77aecd5070) at /data/src/10.1/sql/sql_parse.cc:1121
#22 0x0000563d9b04c48d in do_handle_one_connection (thd_arg=0x7f77aecd5070) at /data/src/10.1/sql/sql_connect.cc:1330
#23 0x0000563d9b04c1f1 in handle_one_connection (arg=0x7f77aecd5070) at /data/src/10.1/sql/sql_connect.cc:1242
#24 0x0000563d9b4096c6 in pfs_spawn_thread (arg=0x7f77b5439ef0) at /data/src/10.1/storage/perfschema/pfs.cc:1861
#25 0x00007f77b7c39494 in start_thread (arg=0x7f77b7fbab00) at pthread_create.c:333
#26 0x00007f77b5ff293f in clone () from /lib/x86_64-linux-gnu/libc.so.6

Comment by Elena Stepanova [ 2023-06-08 ]

Now I cannot reproduce it anymore either, even on old releases. Let's close it and see if it pops up in tests again.

Generated at Thu Feb 08 08:21:33 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.