Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-12972

Random and Frequent Segfault (SIG 11) During Runtime

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.1.23, 10.0(EOL), 10.1(EOL), 10.2(EOL)
    • 10.1.26, 10.0.32, 10.2.8
    • Optimizer
    • None
    • CentOS7, Ubuntu 16.04

    Description

      Over the past few weeks I've had three different servers with MariaDB 10.1.20 through 10.1.23 inclusive all crash seemingly at random with a Signal 11.

      No common time of day, no common operation other than operating on data stored in InnoDB tables.

      Attached is the GDB trace from the core dump I managed to obtain as well as configs.

      I can make the core dump available if needs be however it is over 200GB uncompressed, approx 50GB with gzip.

      The overall configuration is a single read/write master with slaves attached. One of the slaves is used for SELECT statements by the same application while the other is a dormant stand by.

      The crash happens on three separate servers of identical hardware. CentOS7 and Ubuntu 16.04 have both been the running OS while the crash has occurred. The crash has only occurred while the server has been operating as the master with slaves attached, as yet we've not seen an active read slave or dormant slave exhibit the same crash. Potential concurrency issue?

      We DO have a vast number of tables in the DB, close to 700k, approx 400k of those active during a normal business day. table_open_cache is currently set at 524288 because we've found if we don't set it/set it low, we get locked up waiting for mysql to look through the table cache for an evict-able table before it gives up and adds a new table_cache_entry anyway.

      Also attached is the my.cnf and the query/query plan for the query in the stack trace for the thread that segfaulted, though we've seen it happen on optimize table statements before now.

      On the current active write master I've set the optimizer_switch to default as I see MRR involved in the thread that died, something we turned on at some point and I note in MariaDB defaults, it's switched off.

      Attachments

        1. example-query.sql
          3 kB
        2. mariadb-debug.out
          329 kB
        3. my.cnf
          5 kB
        4. _usr_sbin_mysqld.112.crash
          55 kB
        5. show-create-index.txt
          17 kB
        6. stack1
          111 kB

        Issue Links

          Activity

            Ninpo Alex Boag-Munroe created issue -

            I took a look at the stack trace in mariadb-debug.out, but I couldn't find out the thread that caused the crash.
            Can you please also post the *.err file, as this may give us the function where the crash happend.

            This shouldn't have anything to do with number of tables.

            I noticed that a lot of threads was locked inside of jemalloc. This may be normal, but I haven't noticed this before.

            I am happy to take a look at the core dump, but for that to be useful, I need also the mysqld executable and preferable be able to look at it at a similar system that you are using.

            What is your exact OS version and is it possible to get remote access to be able to access a copy of the mysqld executable and the core? If yes, then please ping me on IRC (you know my nick #montywi)

            monty Michael Widenius added a comment - I took a look at the stack trace in mariadb-debug.out, but I couldn't find out the thread that caused the crash. Can you please also post the *.err file, as this may give us the function where the crash happend. This shouldn't have anything to do with number of tables. I noticed that a lot of threads was locked inside of jemalloc. This may be normal, but I haven't noticed this before. I am happy to take a look at the core dump, but for that to be useful, I need also the mysqld executable and preferable be able to look at it at a similar system that you are using. What is your exact OS version and is it possible to get remote access to be able to access a copy of the mysqld executable and the core? If yes, then please ping me on IRC (you know my nick #montywi)

            monty,

            Apparently here is the crashing thread (from mariadb-debug.out):

            Thread 236 (Thread 0x7ef0555a2b00 (LWP 1036)):
            #0  0x00007f2654af1bc8 in pthread_mutex_unlock () from /lib64/libpthread.so.0
            #1  0x00007f2654d20035 in je_malloc_mutex_unlock (mutex=<optimized out>) at include/jemalloc/internal/mutex.h:92
            #2  je_malloc_mutex_postfork_parent (mutex=<optimized out>) at src/mutex.c:116
            #3  0x00007f2654d13a31 in je_arena_postfork_parent (arena=0x7f26500000c0) at src/arena.c:2565
            #4  0x00007f2654d071c7 in je_jemalloc_postfork_parent () at src/jemalloc.c:2032
            #5  0x00007f26531088cf in fork () from /lib64/libc.so.6
            #6  0x00007f2655c01226 in my_addr_resolve_init () at /home/buildbot/buildbot/build/mariadb-10.1.23/mysys/my_addr_resolve.c:239
            #7  0x00007f2655bed4ec in print_with_addr_resolve (n=37, addrs=0x7ef05559dd90) at /home/buildbot/buildbot/build/mariadb-10.1.23/mysys/stacktrace.c:244
            #8  my_print_stacktrace (stack_bottom=0x7ef0555a20b0 "\b\340\347\332\356~", thread_stack=295936) at /home/buildbot/buildbot/build/mariadb-10.1.23/mysys/stacktrace.c:271
            #9  0x00007f2655710345 in handle_fatal_signal (sig=11) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/signal_handler.cc:166
            #10 <signal handler called>
            #11 0x00007f265317d586 in __strcmp_sse42 () from /lib64/libc.so.6
            #12 0x00007f265589562f in get_share (table_name=0x7ef4af552128 "./jb_thebristolresidentiallettingcoltd/lettings_statement") at /home/buildbot/buildbot/build/mariadb-10.1.23/storage/xtradb/handler/ha_innodb.cc:16155
            #13 ha_innobase::open (this=0x7ef5675a1490, name=0x7ef4af552128 "./jb_thebristolresidentiallettingcoltd/lettings_statement", mode=<optimized out>, test_if_locked=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/storage/xtradb/handler/ha_innodb.cc:6310
            #14 0x00007f2655714663 in handler::ha_open (this=this@entry=0x7ef5675a1490, table_arg=<optimized out>, name=name@entry=0x7ef4af552128 "./jb_thebristolresidentiallettingcoltd/lettings_statement", mode=39, test_if_locked=test_if_locked@entry=2) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/handler.cc:2501
            #15 0x00007f2655714980 in handler::clone (this=this@entry=0x7ef6b220b820, name=0x7ef4af552128 "./jb_thebristolresidentiallettingcoltd/lettings_statement", mem_root=0x7eeedae82c98) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/handler.cc:2413
            #16 0x00007f2655877381 in ha_innobase::clone (this=0x7ef6b220b820, name=<optimized out>, mem_root=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/storage/xtradb/handler/ha_innodb.cc:6730
            #17 0x00007f265568b6bd in DsMrr_impl::setup_two_handlers (this=this@entry=0x7ef6b220bce8) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/multi_range_read.cc:1054
            #18 0x00007f265568bed8 in DsMrr_impl::dsmrr_init (this=0x7ef6b220bce8, h_arg=<optimized out>, seq_funcs=0x7ef4087d8918, seq_init_param=0x7ef4087d8728, n_ranges=1, mode=1665, buf=0x7ef4087d8948) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/multi_range_read.cc:951
            #19 0x00007f2655877afa in ha_innobase::multi_range_read_init (this=<optimized out>, seq=<optimized out>, seq_init_param=<optimized out>, n_ranges=<optimized out>, mode=<optimized out>, buf=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/storage/xtradb/handler/ha_innodb.cc:22043
            #20 0x00007f2655687076 in JOIN_CACHE::join_matching_records (this=0x7ef4087d8728, skip_last=false) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_join_cache.cc:2260
            #21 0x00007f2655686b84 in JOIN_CACHE::join_records (this=0x7ef4087d8728, skip_last=skip_last@entry=false) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_join_cache.cc:2097
            #22 0x00007f2655686bc4 in JOIN_CACHE::join_records (this=0x7ef4087d8460, skip_last=skip_last@entry=false) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_join_cache.cc:2143
            #23 0x00007f2655686bc4 in JOIN_CACHE::join_records (this=this@entry=0x7ef4087d7f20, skip_last=skip_last@entry=false) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_join_cache.cc:2143
            #24 0x00007f26555c44ea in sub_select_cache (join=0x7ef777aa2888, join_tab=0x7ef4087d6050, end_of_records=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:18104
            #25 0x00007f26555d2d99 in do_select (join=join@entry=0x7ef777aa2888, fields=fields@entry=0x0, table=table@entry=0x7ef8bd4d3020, procedure=procedure@entry=0x0) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:17980
            #26 0x00007f26555e4dd6 in JOIN::exec_inner (this=this@entry=0x7ef777aa2888) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:2765
            #27 0x00007f26555e7104 in JOIN::exec (this=this@entry=0x7ef777aa2888) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:2511
            #28 0x00007f26555e37a2 in mysql_select (thd=thd@entry=0x7eeedae7e008, rref_pointer_array=rref_pointer_array@entry=0x7ef4e5085750, tables=<optimized out>, wild_num=<optimized out>, fields=..., conds=<optimized out>, og_num=4, order=0x7ef374d8e978, group=0x0, having=0x0, proc_param=proc_param@entry=0x0, select_options=2416184065, 
                result=result@entry=0x7ef777aa2790, unit=unit@entry=0x7ef4e5085838, select_lex=select_lex@entry=0x7ef4e50854d8) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:3445
            #29 0x00007f2655568b9b in mysql_derived_fill (thd=0x7eeedae7e008, lex=0x7eeedae81848, derived=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_derived.cc:942
            #30 0x00007f2655569eb4 in mysql_handle_single_derived (lex=0x7eeedae81848, derived=derived@entry=0x7ef374d8f628, phases=phases@entry=96) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_derived.cc:195
            #31 0x00007f26555c3dbf in st_join_table::preread_init (this=0x7ef56759f850) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:11713
            #32 0x00007f26555dd078 in create_sort_index (thd=0x7eeedae7e008, join=join@entry=0x7ef777aa2210, order=0x7ef374d8fd50, filesort_limit=18446744073709551615, select_limit=<optimized out>, is_order_by=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:21453
            #33 0x00007f26555e5017 in JOIN::exec_inner (this=this@entry=0x7ef777aa2210) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:3191
            #34 0x00007f26555e7104 in JOIN::exec (this=this@entry=0x7ef777aa2210) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:2511
            #35 0x00007f26555e37a2 in mysql_select (thd=thd@entry=0x7eeedae7e008, rref_pointer_array=rref_pointer_array@entry=0x7eeedae82288, tables=<optimized out>, wild_num=<optimized out>, fields=..., conds=<optimized out>, og_num=3, order=0x7ef374d8fd50, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, 
                result=result@entry=0x7ef374d8ffb8, unit=unit@entry=0x7eeedae81910, select_lex=select_lex@entry=0x7eeedae82010) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:3445
            #36 0x00007f26555e4285 in handle_select (thd=thd@entry=0x7eeedae7e008, lex=lex@entry=0x7eeedae81848, result=result@entry=0x7ef374d8ffb8, setup_tables_done_option=setup_tables_done_option@entry=0) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:384
            #37 0x00007f26555835c1 in execute_sqlcom_select (thd=thd@entry=0x7eeedae7e008, all_tables=0x7ef374d8f628) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:5920
            #38 0x00007f265558f9db in mysql_execute_command (thd=thd@entry=0x7eeedae7e008) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:2979
            #39 0x00007f2655592fb2 in mysql_parse (thd=0x7eeedae7e008, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:7343
            #40 0x00007f265559648a in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7eeedae7e008, 
                packet=packet@entry=0x7ef9fa83100d "\n\t\tSELECT * FROM (\n\t\tSELECT DISTINCT\n\t\t\t`lettings_ledger`.`ledgerID`, \n\t\t\t`lettings_ledger`.`transactionID`, \n\t\t\t`lettings_ledger`.`activityID`, \n\t\t\t`lettings_ledger`.`transactionType`,\n\t\t\t`lettings_ledger`.`propertyID`,\n\t\t\t`lettings_ledger`.`tenancyID`, \n\t\t\t`lettings_ledger`.`ledgerType`,\n\t\t\t`lettings_ledger`.`itemDescription`, `lettings_ledger`.`debitAmount`, \n\t\t\t\t\t`lettings_ledger`.`creditAmount`, `lettings_ledger`.`documentType`, \n\t\t\t`lettings_ledger`.`documentNumber`, \n\t\t\t`lettings_ledger`.`statementNumber`, \n\t\t\t`lettings_ledger`.`dateCreated`, \n\t\t\t`lettings_ledger`.`linkTable`, \n\t\t\t`lettings_ledger`.`linkField`,\n\t\t\t`lettings_ledger`.`linkID`, \n\t\t\t`lettings_statement`.`propertyStatementNumber`, \n\t\t\t`user`.`displayName`,\n\t\t\t`property`.`addressName`,\n\t\t\t`property`.`addressNumber`, \n\t\t\t`property`.`addressStreet`, \n\t\t\t`property`.`address2`, \n\t\t\t`property`.`address3`, \n\t\t\t`property`.`address4`, \n\t\t\t`property`.`addressPostcode`  \n\t\tFROM \n\t\t\t`lettings_ledger` \n\t\tINNER JOIN \n\t\t\t`user` ON `user`.`userID` = `lettings_ledger`.`recordCreatedBy` \n\t\tLEFT JOIN \n\t\t\t`property` ON `property`.`propertyID` = `lettings_ledger`.`propertyID` \n\t\tLEFT JOIN \n\t\t\t`lettings_statement` ON `lettings_ledger`.`statementNumber` = `lettings_statement`.`statementNumber` \n\t\t\n\t\t\tWHERE \n\t\t\t\t`lettings_ledger`.`ledgerType` = 'tenant' AND \n\t\t\t\t`lettings_ledger`.`tenancyID` = '4679' ORDER BY \n\t\t\t`lettings_ledger`.`dateCreated` DESC, \n\t\t\t`lettings_ledger`.`ledgerID` DESC, \n\t\t\t`lettings_ledger`.`activityID` DESC, \n\t\t\t`lettings_ledger`.`transactionID` DESC LIMIT 5) AS `test` \n\t\tORDER BY \n\t\t\t`dateCreated`, \n\t\t\t`activityID`, \n\t\t\t`transactionID`", packet_length=packet_length@entry=1629) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:1490
            #41 0x00007f2655596cea in do_command (thd=0x7eeedae7e008) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:1109
            #42 0x00007f265565eb6a in do_handle_one_connection (thd_arg=thd_arg@entry=0x7ef7428fb008) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_connect.cc:1349
            #43 0x00007f265565ed10 in handle_one_connection (arg=0x7ef7428fb008) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_connect.cc:1261
            #44 0x00007f2654aeedc5 in start_thread () from /lib64/libpthread.so.0
            #45 0x00007f265314173d in clone () from /lib64/libc.so.6
            

            elenst Elena Stepanova added a comment - monty , Apparently here is the crashing thread (from mariadb-debug.out): Thread 236 (Thread 0x7ef0555a2b00 (LWP 1036)): #0 0x00007f2654af1bc8 in pthread_mutex_unlock () from /lib64/libpthread.so.0 #1 0x00007f2654d20035 in je_malloc_mutex_unlock (mutex=<optimized out>) at include/jemalloc/internal/mutex.h:92 #2 je_malloc_mutex_postfork_parent (mutex=<optimized out>) at src/mutex.c:116 #3 0x00007f2654d13a31 in je_arena_postfork_parent (arena=0x7f26500000c0) at src/arena.c:2565 #4 0x00007f2654d071c7 in je_jemalloc_postfork_parent () at src/jemalloc.c:2032 #5 0x00007f26531088cf in fork () from /lib64/libc.so.6 #6 0x00007f2655c01226 in my_addr_resolve_init () at /home/buildbot/buildbot/build/mariadb-10.1.23/mysys/my_addr_resolve.c:239 #7 0x00007f2655bed4ec in print_with_addr_resolve (n=37, addrs=0x7ef05559dd90) at /home/buildbot/buildbot/build/mariadb-10.1.23/mysys/stacktrace.c:244 #8 my_print_stacktrace (stack_bottom=0x7ef0555a20b0 "\b\340\347\332\356~", thread_stack=295936) at /home/buildbot/buildbot/build/mariadb-10.1.23/mysys/stacktrace.c:271 #9 0x00007f2655710345 in handle_fatal_signal (sig=11) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/signal_handler.cc:166 #10 <signal handler called> #11 0x00007f265317d586 in __strcmp_sse42 () from /lib64/libc.so.6 #12 0x00007f265589562f in get_share (table_name=0x7ef4af552128 "./jb_thebristolresidentiallettingcoltd/lettings_statement") at /home/buildbot/buildbot/build/mariadb-10.1.23/storage/xtradb/handler/ha_innodb.cc:16155 #13 ha_innobase::open (this=0x7ef5675a1490, name=0x7ef4af552128 "./jb_thebristolresidentiallettingcoltd/lettings_statement", mode=<optimized out>, test_if_locked=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/storage/xtradb/handler/ha_innodb.cc:6310 #14 0x00007f2655714663 in handler::ha_open (this=this@entry=0x7ef5675a1490, table_arg=<optimized out>, name=name@entry=0x7ef4af552128 "./jb_thebristolresidentiallettingcoltd/lettings_statement", mode=39, test_if_locked=test_if_locked@entry=2) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/handler.cc:2501 #15 0x00007f2655714980 in handler::clone (this=this@entry=0x7ef6b220b820, name=0x7ef4af552128 "./jb_thebristolresidentiallettingcoltd/lettings_statement", mem_root=0x7eeedae82c98) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/handler.cc:2413 #16 0x00007f2655877381 in ha_innobase::clone (this=0x7ef6b220b820, name=<optimized out>, mem_root=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/storage/xtradb/handler/ha_innodb.cc:6730 #17 0x00007f265568b6bd in DsMrr_impl::setup_two_handlers (this=this@entry=0x7ef6b220bce8) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/multi_range_read.cc:1054 #18 0x00007f265568bed8 in DsMrr_impl::dsmrr_init (this=0x7ef6b220bce8, h_arg=<optimized out>, seq_funcs=0x7ef4087d8918, seq_init_param=0x7ef4087d8728, n_ranges=1, mode=1665, buf=0x7ef4087d8948) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/multi_range_read.cc:951 #19 0x00007f2655877afa in ha_innobase::multi_range_read_init (this=<optimized out>, seq=<optimized out>, seq_init_param=<optimized out>, n_ranges=<optimized out>, mode=<optimized out>, buf=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/storage/xtradb/handler/ha_innodb.cc:22043 #20 0x00007f2655687076 in JOIN_CACHE::join_matching_records (this=0x7ef4087d8728, skip_last=false) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_join_cache.cc:2260 #21 0x00007f2655686b84 in JOIN_CACHE::join_records (this=0x7ef4087d8728, skip_last=skip_last@entry=false) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_join_cache.cc:2097 #22 0x00007f2655686bc4 in JOIN_CACHE::join_records (this=0x7ef4087d8460, skip_last=skip_last@entry=false) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_join_cache.cc:2143 #23 0x00007f2655686bc4 in JOIN_CACHE::join_records (this=this@entry=0x7ef4087d7f20, skip_last=skip_last@entry=false) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_join_cache.cc:2143 #24 0x00007f26555c44ea in sub_select_cache (join=0x7ef777aa2888, join_tab=0x7ef4087d6050, end_of_records=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:18104 #25 0x00007f26555d2d99 in do_select (join=join@entry=0x7ef777aa2888, fields=fields@entry=0x0, table=table@entry=0x7ef8bd4d3020, procedure=procedure@entry=0x0) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:17980 #26 0x00007f26555e4dd6 in JOIN::exec_inner (this=this@entry=0x7ef777aa2888) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:2765 #27 0x00007f26555e7104 in JOIN::exec (this=this@entry=0x7ef777aa2888) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:2511 #28 0x00007f26555e37a2 in mysql_select (thd=thd@entry=0x7eeedae7e008, rref_pointer_array=rref_pointer_array@entry=0x7ef4e5085750, tables=<optimized out>, wild_num=<optimized out>, fields=..., conds=<optimized out>, og_num=4, order=0x7ef374d8e978, group=0x0, having=0x0, proc_param=proc_param@entry=0x0, select_options=2416184065, result=result@entry=0x7ef777aa2790, unit=unit@entry=0x7ef4e5085838, select_lex=select_lex@entry=0x7ef4e50854d8) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:3445 #29 0x00007f2655568b9b in mysql_derived_fill (thd=0x7eeedae7e008, lex=0x7eeedae81848, derived=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_derived.cc:942 #30 0x00007f2655569eb4 in mysql_handle_single_derived (lex=0x7eeedae81848, derived=derived@entry=0x7ef374d8f628, phases=phases@entry=96) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_derived.cc:195 #31 0x00007f26555c3dbf in st_join_table::preread_init (this=0x7ef56759f850) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:11713 #32 0x00007f26555dd078 in create_sort_index (thd=0x7eeedae7e008, join=join@entry=0x7ef777aa2210, order=0x7ef374d8fd50, filesort_limit=18446744073709551615, select_limit=<optimized out>, is_order_by=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:21453 #33 0x00007f26555e5017 in JOIN::exec_inner (this=this@entry=0x7ef777aa2210) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:3191 #34 0x00007f26555e7104 in JOIN::exec (this=this@entry=0x7ef777aa2210) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:2511 #35 0x00007f26555e37a2 in mysql_select (thd=thd@entry=0x7eeedae7e008, rref_pointer_array=rref_pointer_array@entry=0x7eeedae82288, tables=<optimized out>, wild_num=<optimized out>, fields=..., conds=<optimized out>, og_num=3, order=0x7ef374d8fd50, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=result@entry=0x7ef374d8ffb8, unit=unit@entry=0x7eeedae81910, select_lex=select_lex@entry=0x7eeedae82010) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:3445 #36 0x00007f26555e4285 in handle_select (thd=thd@entry=0x7eeedae7e008, lex=lex@entry=0x7eeedae81848, result=result@entry=0x7ef374d8ffb8, setup_tables_done_option=setup_tables_done_option@entry=0) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_select.cc:384 #37 0x00007f26555835c1 in execute_sqlcom_select (thd=thd@entry=0x7eeedae7e008, all_tables=0x7ef374d8f628) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:5920 #38 0x00007f265558f9db in mysql_execute_command (thd=thd@entry=0x7eeedae7e008) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:2979 #39 0x00007f2655592fb2 in mysql_parse (thd=0x7eeedae7e008, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:7343 #40 0x00007f265559648a in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7eeedae7e008, packet=packet@entry=0x7ef9fa83100d "\n\t\tSELECT * FROM (\n\t\tSELECT DISTINCT\n\t\t\t`lettings_ledger`.`ledgerID`, \n\t\t\t`lettings_ledger`.`transactionID`, \n\t\t\t`lettings_ledger`.`activityID`, \n\t\t\t`lettings_ledger`.`transactionType`,\n\t\t\t`lettings_ledger`.`propertyID`,\n\t\t\t`lettings_ledger`.`tenancyID`, \n\t\t\t`lettings_ledger`.`ledgerType`,\n\t\t\t`lettings_ledger`.`itemDescription`, `lettings_ledger`.`debitAmount`, \n\t\t\t\t\t`lettings_ledger`.`creditAmount`, `lettings_ledger`.`documentType`, \n\t\t\t`lettings_ledger`.`documentNumber`, \n\t\t\t`lettings_ledger`.`statementNumber`, \n\t\t\t`lettings_ledger`.`dateCreated`, \n\t\t\t`lettings_ledger`.`linkTable`, \n\t\t\t`lettings_ledger`.`linkField`,\n\t\t\t`lettings_ledger`.`linkID`, \n\t\t\t`lettings_statement`.`propertyStatementNumber`, \n\t\t\t`user`.`displayName`,\n\t\t\t`property`.`addressName`,\n\t\t\t`property`.`addressNumber`, \n\t\t\t`property`.`addressStreet`, \n\t\t\t`property`.`address2`, \n\t\t\t`property`.`address3`, \n\t\t\t`property`.`address4`, \n\t\t\t`property`.`addressPostcode` \n\t\tFROM \n\t\t\t`lettings_ledger` \n\t\tINNER JOIN \n\t\t\t`user` ON `user`.`userID` = `lettings_ledger`.`recordCreatedBy` \n\t\tLEFT JOIN \n\t\t\t`property` ON `property`.`propertyID` = `lettings_ledger`.`propertyID` \n\t\tLEFT JOIN \n\t\t\t`lettings_statement` ON `lettings_ledger`.`statementNumber` = `lettings_statement`.`statementNumber` \n\t\t\n\t\t\tWHERE \n\t\t\t\t`lettings_ledger`.`ledgerType` = 'tenant' AND \n\t\t\t\t`lettings_ledger`.`tenancyID` = '4679' ORDER BY \n\t\t\t`lettings_ledger`.`dateCreated` DESC, \n\t\t\t`lettings_ledger`.`ledgerID` DESC, \n\t\t\t`lettings_ledger`.`activityID` DESC, \n\t\t\t`lettings_ledger`.`transactionID` DESC LIMIT 5) AS `test` \n\t\tORDER BY \n\t\t\t`dateCreated`, \n\t\t\t`activityID`, \n\t\t\t`transactionID`", packet_length=packet_length@entry=1629) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:1490 #41 0x00007f2655596cea in do_command (thd=0x7eeedae7e008) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_parse.cc:1109 #42 0x00007f265565eb6a in do_handle_one_connection (thd_arg=thd_arg@entry=0x7ef7428fb008) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_connect.cc:1349 #43 0x00007f265565ed10 in handle_one_connection (arg=0x7ef7428fb008) at /home/buildbot/buildbot/build/mariadb-10.1.23/sql/sql_connect.cc:1261 #44 0x00007f2654aeedc5 in start_thread () from /lib64/libpthread.so.0 #45 0x00007f265314173d in clone () from /lib64/libc.so.6
            elenst Elena Stepanova added a comment - - edited

            Ninpo,

            Could you please provide SHOW CREATE TABLE and SHOW INDEX IN for each of lettings_ledger, user, property, lettings_statement tables?
            If any of them are views, please do so for underlying tables as well.

            Thanks.

            elenst Elena Stepanova added a comment - - edited Ninpo , Could you please provide SHOW CREATE TABLE and SHOW INDEX IN for each of lettings_ledger, user, property, lettings_statement tables? If any of them are views, please do so for underlying tables as well. Thanks.
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Labels need_feedback

            Without the information requested above and with purely artificial data/structures instead, I couldn't get it with XtraDB, but I got a very similar crash with TokuDB:

            10.1 f58142f644cd773c0eb02ba95920ca7ac55799d3

            #3  <signal handler called>
            #4  memset () at ../sysdeps/x86_64/memset.S:93
            #5  0x00007fabd09c6389 in Lifo_buffer::set_buffer_space (this=0x7fabc56656d8, start_arg=0x7fabc55e93d3 '\217' <repeats 200 times>..., end_arg=0x7fabc55e93cc '\217' <repeats 200 times>...) at /data/src/10.1/sql/sql_lifo_buffer.h:87
            #6  0x00007fabd09c4d8e in DsMrr_impl::setup_buffer_sharing (this=0x7fabc5665508, key_size_in_keybuf=4, key_tuple_map=1) at /data/src/10.1/sql/multi_range_read.cc:1244
            #7  0x00007fabd09c40e7 in DsMrr_impl::dsmrr_init (this=0x7fabc5665508, h_arg=0x7fabc5665088, seq_funcs=0x7fabac05e0a8, seq_init_param=0x7fabac05deb8, n_ranges=14, mode=1665, buf=0x7fabac05e0d8) at /data/src/10.1/sql/multi_range_read.cc:936
            #8  0x00007fabc484360c in ha_tokudb::multi_range_read_init (this=0x7fabc5665088, seq=0x7fabac05e0a8, seq_init_param=0x7fabac05deb8, n_ranges=14, mode=1665, buf=0x7fabac05e0d8) at /data/src/10.1/storage/tokudb/ha_tokudb_mrr_maria.cc:37
            #9  0x00007fabd09bd591 in JOIN_TAB_SCAN_MRR::open (this=0x7fabac05e080) at /data/src/10.1/sql/sql_join_cache.cc:3894
            #10 0x00007fabd09bb00e in JOIN_CACHE::join_matching_records (this=0x7fabac05deb8, skip_last=false) at /data/src/10.1/sql/sql_join_cache.cc:2260
            #11 0x00007fabd09babb6 in JOIN_CACHE::join_records (this=0x7fabac05deb8, skip_last=false) at /data/src/10.1/sql/sql_join_cache.cc:2097
            #12 0x00007fabd08b0a52 in sub_select_cache (join=0x7fabc5448210, join_tab=0x7fabac05cb68, end_of_records=false) at /data/src/10.1/sql/sql_select.cc:18200
            #13 0x00007fabd09bb461 in JOIN_CACHE::generate_full_extensions (this=0x7fabac05dbf8, rec_ptr=0x7fabc559c800 "") at /data/src/10.1/sql/sql_join_cache.cc:2410
            #14 0x00007fabd09bb1a4 in JOIN_CACHE::join_matching_records (this=0x7fabac05dbf8, skip_last=false) at /data/src/10.1/sql/sql_join_cache.cc:2302
            #15 0x00007fabd09babb6 in JOIN_CACHE::join_records (this=0x7fabac05dbf8, skip_last=false) at /data/src/10.1/sql/sql_join_cache.cc:2097
            #16 0x00007fabd08b098d in sub_select_cache (join=0x7fabc5448210, join_tab=0x7fabac05c820, end_of_records=true) at /data/src/10.1/sql/sql_select.cc:18180
            #17 0x00007fabd08b0b77 in sub_select (join=0x7fabc5448210, join_tab=0x7fabac05c4d8, end_of_records=true) at /data/src/10.1/sql/sql_select.cc:18350
            #18 0x00007fabd08b064f in do_select (join=0x7fabc5448210, fields=0x7fabc5443ad0, table=0x0, procedure=0x0) at /data/src/10.1/sql/sql_select.cc:18056
            #19 0x00007fabd088b070 in JOIN::exec_inner (this=0x7fabc5448210) at /data/src/10.1/sql/sql_select.cc:3225
            #20 0x00007fabd0888301 in JOIN::exec (this=0x7fabc5448210) at /data/src/10.1/sql/sql_select.cc:2512
            #21 0x00007fabd0b2eef6 in subselect_single_select_engine::exec (this=0x7fabc5447578) at /data/src/10.1/sql/item_subselect.cc:3840
            #22 0x00007fabd0b24e5b in Item_subselect::exec (this=0x7fabc5447428) at /data/src/10.1/sql/item_subselect.cc:710
            #23 0x00007fabd0b277b8 in Item_exists_subselect::val_int (this=0x7fabc5447428) at /data/src/10.1/sql/item_subselect.cc:1582
            #24 0x00007fabd0792d09 in Item::val_int_result (this=0x7fabc5447428) at /data/src/10.1/sql/item.h:1034
            #25 0x00007fabd0aab36f in Item_cache_int::cache_value (this=0x7fabac060138) at /data/src/10.1/sql/item.cc:8891
            #26 0x00007fabd0ab2eca in Item_cache_wrapper::cache (this=0x7fabac060088) at /data/src/10.1/sql/item.cc:7738
            #27 0x00007fabd0aa7fb6 in Item_cache_wrapper::val_int (this=0x7fabac060088) at /data/src/10.1/sql/item.cc:7792
            #28 0x00007fabd0ab90bf in Item_in_optimizer::val_int (this=0x7fabac010170) at /data/src/10.1/sql/item_cmpfunc.cc:1552
            #29 0x00007fabd08b1175 in evaluate_join_record (join=0x7fabc54476f8, join_tab=0x7fabac011380, error=0) at /data/src/10.1/sql/sql_select.cc:18497
            #30 0x00007fabd08b0d9f in sub_select (join=0x7fabc54476f8, join_tab=0x7fabac011380, end_of_records=false) at /data/src/10.1/sql/sql_select.cc:18398
            #31 0x00007fabd08b0602 in do_select (join=0x7fabc54476f8, fields=0x7fabc73fd2d0, table=0x0, procedure=0x0) at /data/src/10.1/sql/sql_select.cc:18053
            #32 0x00007fabd088b070 in JOIN::exec_inner (this=0x7fabc54476f8) at /data/src/10.1/sql/sql_select.cc:3225
            #33 0x00007fabd0888301 in JOIN::exec (this=0x7fabc54476f8) at /data/src/10.1/sql/sql_select.cc:2512
            #34 0x00007fabd088b897 in mysql_select (thd=0x7fabc73f9070, rref_pointer_array=0x7fabc73fd448, tables=0x7fabc54433b0, wild_num=1, fields=..., conds=0x7fabc5447428, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fabc54476d8, unit=0x7fabc73fcab8, select_lex=0x7fabc73fd1b8) at /data/src/10.1/sql/sql_select.cc:3449
            #35 0x00007fabd08812bc in handle_select (thd=0x7fabc73f9070, lex=0x7fabc73fc9f0, result=0x7fabc54476d8, setup_tables_done_option=0) at /data/src/10.1/sql/sql_select.cc:384
            #36 0x00007fabd0851166 in execute_sqlcom_select (thd=0x7fabc73f9070, all_tables=0x7fabc54433b0) at /data/src/10.1/sql/sql_parse.cc:5923
            #37 0x00007fabd0847662 in mysql_execute_command (thd=0x7fabc73f9070) at /data/src/10.1/sql/sql_parse.cc:2980
            #38 0x00007fabd0854871 in mysql_parse (thd=0x7fabc73f9070, rawbuf=0x7fabc5443088 "SELECT * FROM t1 AS t1_outer WHERE EXISTS ( SELECT * FROM t2 WHERE i2 IN ( SELECT i3 FROM t3 INNER JOIN t1 AS t1_inner ON (t1_inner.c1 = c3 ) WHERE t1_inner.i1 < t1_outer.i1 ) )", length=177, parser_state=0x7fabd175f5e0) at /data/src/10.1/sql/sql_parse.cc:7339
            #39 0x00007fabd084385c in dispatch_command (command=COM_QUERY, thd=0x7fabc73f9070, packet=0x7fabc8d3e071 "", packet_length=177) at /data/src/10.1/sql/sql_parse.cc:1490
            #40 0x00007fabd08425bb in do_command (thd=0x7fabc73f9070) at /data/src/10.1/sql/sql_parse.cc:1109
            #41 0x00007fabd097a125 in do_handle_one_connection (thd_arg=0x7fabc73f9070) at /data/src/10.1/sql/sql_connect.cc:1349
            #42 0x00007fabd0979e89 in handle_one_connection (arg=0x7fabc73f9070) at /data/src/10.1/sql/sql_connect.cc:1261
            #43 0x00007fabd0c5e1bc in pfs_spawn_thread (arg=0x7fabcd827ef0) at /data/src/10.1/storage/perfschema/pfs.cc:1860
            #44 0x00007fabcff3e064 in start_thread (arg=0x7fabd1760b00) at pthread_create.c:309
            #45 0x00007fabce0f662d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
            

            INSTALL SONAME 'ha_tokudb';
             
            CREATE TABLE t1 (pk INT PRIMARY KEY, i1 INT, c1 VARCHAR(1), KEY(c1)) ENGINE=TokuDB;
            INSERT INTO t1 VALUES 
            (1,2,NULL),(2,5,'x'),(3,3,'i'),(4,1,'e'),(5,4,'p');
             
            CREATE TABLE t2 (i2 INT) ENGINE=TokuDB;
            INSERT INTO t2 VALUES (1),(2);
             
            CREATE TABLE t3 (i3 INT, c3 VARCHAR(1), KEY(i3)) ENGINE=TokuDB;
            INSERT INTO t3 VALUES 
            (1,'e'),(1,'z'),(1,'i'),(1,'q'),(1,'i'),(1,'f'),(1,'m'),(2,'c'),(1,'d'),
            (2,'n'),(1,'t'),(2,'e'),(1,'w'),(2,'y'),(1,'j'),(2,'i'),(1,'f'),(2,'f'),
            (1,'s'),(2,'y');
             
            SET join_cache_level = 8;
            SET optimizer_switch ='mrr=on,mrr_sort_keys=on,optimize_join_buffer_size=on';
             
            SELECT * FROM t1 AS t1_outer WHERE EXISTS ( SELECT * FROM t2 WHERE i2 IN ( SELECT i3 FROM t3 INNER JOIN t1 AS t1_inner ON (t1_inner.c1 = c3 ) WHERE t1_inner.i1 < t1_outer.i1 ) );
             
            DROP TABLE t1, t2, t3;
            UNINSTALL SONAME 'ha_tokudb';
            

            Also reproducible on 10.0 and 10.2. Didn't get the crash on 5.5, but with valgrind it also misbehaves.

            elenst Elena Stepanova added a comment - Without the information requested above and with purely artificial data/structures instead, I couldn't get it with XtraDB, but I got a very similar crash with TokuDB: 10.1 f58142f644cd773c0eb02ba95920ca7ac55799d3 #3 <signal handler called> #4 memset () at ../sysdeps/x86_64/memset.S:93 #5 0x00007fabd09c6389 in Lifo_buffer::set_buffer_space (this=0x7fabc56656d8, start_arg=0x7fabc55e93d3 '\217' <repeats 200 times>..., end_arg=0x7fabc55e93cc '\217' <repeats 200 times>...) at /data/src/10.1/sql/sql_lifo_buffer.h:87 #6 0x00007fabd09c4d8e in DsMrr_impl::setup_buffer_sharing (this=0x7fabc5665508, key_size_in_keybuf=4, key_tuple_map=1) at /data/src/10.1/sql/multi_range_read.cc:1244 #7 0x00007fabd09c40e7 in DsMrr_impl::dsmrr_init (this=0x7fabc5665508, h_arg=0x7fabc5665088, seq_funcs=0x7fabac05e0a8, seq_init_param=0x7fabac05deb8, n_ranges=14, mode=1665, buf=0x7fabac05e0d8) at /data/src/10.1/sql/multi_range_read.cc:936 #8 0x00007fabc484360c in ha_tokudb::multi_range_read_init (this=0x7fabc5665088, seq=0x7fabac05e0a8, seq_init_param=0x7fabac05deb8, n_ranges=14, mode=1665, buf=0x7fabac05e0d8) at /data/src/10.1/storage/tokudb/ha_tokudb_mrr_maria.cc:37 #9 0x00007fabd09bd591 in JOIN_TAB_SCAN_MRR::open (this=0x7fabac05e080) at /data/src/10.1/sql/sql_join_cache.cc:3894 #10 0x00007fabd09bb00e in JOIN_CACHE::join_matching_records (this=0x7fabac05deb8, skip_last=false) at /data/src/10.1/sql/sql_join_cache.cc:2260 #11 0x00007fabd09babb6 in JOIN_CACHE::join_records (this=0x7fabac05deb8, skip_last=false) at /data/src/10.1/sql/sql_join_cache.cc:2097 #12 0x00007fabd08b0a52 in sub_select_cache (join=0x7fabc5448210, join_tab=0x7fabac05cb68, end_of_records=false) at /data/src/10.1/sql/sql_select.cc:18200 #13 0x00007fabd09bb461 in JOIN_CACHE::generate_full_extensions (this=0x7fabac05dbf8, rec_ptr=0x7fabc559c800 "") at /data/src/10.1/sql/sql_join_cache.cc:2410 #14 0x00007fabd09bb1a4 in JOIN_CACHE::join_matching_records (this=0x7fabac05dbf8, skip_last=false) at /data/src/10.1/sql/sql_join_cache.cc:2302 #15 0x00007fabd09babb6 in JOIN_CACHE::join_records (this=0x7fabac05dbf8, skip_last=false) at /data/src/10.1/sql/sql_join_cache.cc:2097 #16 0x00007fabd08b098d in sub_select_cache (join=0x7fabc5448210, join_tab=0x7fabac05c820, end_of_records=true) at /data/src/10.1/sql/sql_select.cc:18180 #17 0x00007fabd08b0b77 in sub_select (join=0x7fabc5448210, join_tab=0x7fabac05c4d8, end_of_records=true) at /data/src/10.1/sql/sql_select.cc:18350 #18 0x00007fabd08b064f in do_select (join=0x7fabc5448210, fields=0x7fabc5443ad0, table=0x0, procedure=0x0) at /data/src/10.1/sql/sql_select.cc:18056 #19 0x00007fabd088b070 in JOIN::exec_inner (this=0x7fabc5448210) at /data/src/10.1/sql/sql_select.cc:3225 #20 0x00007fabd0888301 in JOIN::exec (this=0x7fabc5448210) at /data/src/10.1/sql/sql_select.cc:2512 #21 0x00007fabd0b2eef6 in subselect_single_select_engine::exec (this=0x7fabc5447578) at /data/src/10.1/sql/item_subselect.cc:3840 #22 0x00007fabd0b24e5b in Item_subselect::exec (this=0x7fabc5447428) at /data/src/10.1/sql/item_subselect.cc:710 #23 0x00007fabd0b277b8 in Item_exists_subselect::val_int (this=0x7fabc5447428) at /data/src/10.1/sql/item_subselect.cc:1582 #24 0x00007fabd0792d09 in Item::val_int_result (this=0x7fabc5447428) at /data/src/10.1/sql/item.h:1034 #25 0x00007fabd0aab36f in Item_cache_int::cache_value (this=0x7fabac060138) at /data/src/10.1/sql/item.cc:8891 #26 0x00007fabd0ab2eca in Item_cache_wrapper::cache (this=0x7fabac060088) at /data/src/10.1/sql/item.cc:7738 #27 0x00007fabd0aa7fb6 in Item_cache_wrapper::val_int (this=0x7fabac060088) at /data/src/10.1/sql/item.cc:7792 #28 0x00007fabd0ab90bf in Item_in_optimizer::val_int (this=0x7fabac010170) at /data/src/10.1/sql/item_cmpfunc.cc:1552 #29 0x00007fabd08b1175 in evaluate_join_record (join=0x7fabc54476f8, join_tab=0x7fabac011380, error=0) at /data/src/10.1/sql/sql_select.cc:18497 #30 0x00007fabd08b0d9f in sub_select (join=0x7fabc54476f8, join_tab=0x7fabac011380, end_of_records=false) at /data/src/10.1/sql/sql_select.cc:18398 #31 0x00007fabd08b0602 in do_select (join=0x7fabc54476f8, fields=0x7fabc73fd2d0, table=0x0, procedure=0x0) at /data/src/10.1/sql/sql_select.cc:18053 #32 0x00007fabd088b070 in JOIN::exec_inner (this=0x7fabc54476f8) at /data/src/10.1/sql/sql_select.cc:3225 #33 0x00007fabd0888301 in JOIN::exec (this=0x7fabc54476f8) at /data/src/10.1/sql/sql_select.cc:2512 #34 0x00007fabd088b897 in mysql_select (thd=0x7fabc73f9070, rref_pointer_array=0x7fabc73fd448, tables=0x7fabc54433b0, wild_num=1, fields=..., conds=0x7fabc5447428, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=0x7fabc54476d8, unit=0x7fabc73fcab8, select_lex=0x7fabc73fd1b8) at /data/src/10.1/sql/sql_select.cc:3449 #35 0x00007fabd08812bc in handle_select (thd=0x7fabc73f9070, lex=0x7fabc73fc9f0, result=0x7fabc54476d8, setup_tables_done_option=0) at /data/src/10.1/sql/sql_select.cc:384 #36 0x00007fabd0851166 in execute_sqlcom_select (thd=0x7fabc73f9070, all_tables=0x7fabc54433b0) at /data/src/10.1/sql/sql_parse.cc:5923 #37 0x00007fabd0847662 in mysql_execute_command (thd=0x7fabc73f9070) at /data/src/10.1/sql/sql_parse.cc:2980 #38 0x00007fabd0854871 in mysql_parse (thd=0x7fabc73f9070, rawbuf=0x7fabc5443088 "SELECT * FROM t1 AS t1_outer WHERE EXISTS ( SELECT * FROM t2 WHERE i2 IN ( SELECT i3 FROM t3 INNER JOIN t1 AS t1_inner ON (t1_inner.c1 = c3 ) WHERE t1_inner.i1 < t1_outer.i1 ) )", length=177, parser_state=0x7fabd175f5e0) at /data/src/10.1/sql/sql_parse.cc:7339 #39 0x00007fabd084385c in dispatch_command (command=COM_QUERY, thd=0x7fabc73f9070, packet=0x7fabc8d3e071 "", packet_length=177) at /data/src/10.1/sql/sql_parse.cc:1490 #40 0x00007fabd08425bb in do_command (thd=0x7fabc73f9070) at /data/src/10.1/sql/sql_parse.cc:1109 #41 0x00007fabd097a125 in do_handle_one_connection (thd_arg=0x7fabc73f9070) at /data/src/10.1/sql/sql_connect.cc:1349 #42 0x00007fabd0979e89 in handle_one_connection (arg=0x7fabc73f9070) at /data/src/10.1/sql/sql_connect.cc:1261 #43 0x00007fabd0c5e1bc in pfs_spawn_thread (arg=0x7fabcd827ef0) at /data/src/10.1/storage/perfschema/pfs.cc:1860 #44 0x00007fabcff3e064 in start_thread (arg=0x7fabd1760b00) at pthread_create.c:309 #45 0x00007fabce0f662d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 INSTALL SONAME 'ha_tokudb' ;   CREATE TABLE t1 (pk INT PRIMARY KEY , i1 INT , c1 VARCHAR (1), KEY (c1)) ENGINE=TokuDB; INSERT INTO t1 VALUES (1,2, NULL ),(2,5, 'x' ),(3,3, 'i' ),(4,1, 'e' ),(5,4, 'p' );   CREATE TABLE t2 (i2 INT ) ENGINE=TokuDB; INSERT INTO t2 VALUES (1),(2);   CREATE TABLE t3 (i3 INT , c3 VARCHAR (1), KEY (i3)) ENGINE=TokuDB; INSERT INTO t3 VALUES (1, 'e' ),(1, 'z' ),(1, 'i' ),(1, 'q' ),(1, 'i' ),(1, 'f' ),(1, 'm' ),(2, 'c' ),(1, 'd' ), (2, 'n' ),(1, 't' ),(2, 'e' ),(1, 'w' ),(2, 'y' ),(1, 'j' ),(2, 'i' ),(1, 'f' ),(2, 'f' ), (1, 's' ),(2, 'y' );   SET join_cache_level = 8; SET optimizer_switch = 'mrr=on,mrr_sort_keys=on,optimize_join_buffer_size=on' ;   SELECT * FROM t1 AS t1_outer WHERE EXISTS ( SELECT * FROM t2 WHERE i2 IN ( SELECT i3 FROM t3 INNER JOIN t1 AS t1_inner ON (t1_inner.c1 = c3 ) WHERE t1_inner.i1 < t1_outer.i1 ) );   DROP TABLE t1, t2, t3; UNINSTALL SONAME 'ha_tokudb' ; Also reproducible on 10.0 and 10.2. Didn't get the crash on 5.5, but with valgrind it also misbehaves.
            elenst Elena Stepanova made changes -
            Component/s Optimizer [ 10200 ]
            Component/s Storage Engine - XtraDB [ 10135 ]
            Fix Version/s 10.0 [ 16000 ]
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.2 [ 14601 ]
            Affects Version/s 10.0 [ 16000 ]
            Affects Version/s 10.1 [ 16100 ]
            Affects Version/s 10.2 [ 14601 ]
            Assignee Varun Gupta [ varun ]
            elenst Elena Stepanova made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            Ninpo Alex Boag-Munroe added a comment - - edited

            Hi Elena,

            Sorry I've not provided the info you asked for yet, being the sole tech in my role has got time pretty tight.

            Our current status now is no tables using the tokudb engine but the plugin is still installed and we had another signal 11 crash in libjemalloc a couple of days ago (much longer between crashes, over 30 days instead of a week or two) but it still happened. Sadly apport crashed while handling the 200+GB core, so I've removed apport's involvement and set up for traditional stack traces.

            Worth noting at time of crash the optimizer switches were:

            index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=on,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=off,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
            

            Ninpo Alex Boag-Munroe added a comment - - edited Hi Elena, Sorry I've not provided the info you asked for yet, being the sole tech in my role has got time pretty tight. Our current status now is no tables using the tokudb engine but the plugin is still installed and we had another signal 11 crash in libjemalloc a couple of days ago (much longer between crashes, over 30 days instead of a week or two) but it still happened. Sadly apport crashed while handling the 200+GB core, so I've removed apport's involvement and set up for traditional stack traces. Worth noting at time of crash the optimizer switches were: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=on,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=off,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off

            Ninpo,

            The initial crash clearly starts somewhere in Mrr, so my guess is that the new one, which happened when all mrr is off, would be different. We'll need a stack trace of that, for starters at least the one from the error log if you don't have the coredump (and will probably need another bug report).

            elenst Elena Stepanova added a comment - Ninpo , The initial crash clearly starts somewhere in Mrr, so my guess is that the new one, which happened when all mrr is off, would be different. We'll need a stack trace of that, for starters at least the one from the error log if you don't have the coredump (and will probably need another bug report).
            Ninpo Alex Boag-Munroe made changes -
            Attachment _usr_sbin_mysqld.112.crash [ 43857 ]
            Ninpo Alex Boag-Munroe added a comment - - edited

            170710 15:49:25 [ERROR] mysqld got signal 11 ;
            This could be because you hit a bug. It is also possible that this binary
            or one of the libraries it was linked against is corrupt, improperly built,
            or misconfigured. This error can also be caused by malfunctioning hardware.
             
            To report this bug, see https://mariadb.com/kb/en/reporting-bugs
             
            We will try our best to scrape up some info that will hopefully help
            diagnose the problem, but since we have already crashed, 
            something is definitely wrong and this may fail.
             
            Server version: 10.1.24-MariaDB-1~xenial
            key_buffer_size=1073741824
            read_buffer_size=262144
            max_used_connections=1531
            max_threads=361
            thread_count=110
            It is possible that mysqld could use up to 
            key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1275725 K  bytes of memory
            Hope that's ok; if not, decrease some variables in the equation.
             
            Thread pointer: 0x7f99fe48e008
            Attempting backtrace. You can use the following information to find out
            where mysqld died. If you see no messages after this, something went
            terribly wrong...
            

            That's all that appeared in the mysql error log.

            _usr_sbin_mysqld.112.crash

            That's all apport gave in the end. I appreciate this isn't much to go on.

            The line in dmesg claimed the crash occurred in jemalloc and I'm wondering if we're still having a problem just by virtue of tokudb being loaded still, as it took much longer this time for the crash to develop. The other symptoms are the same as before, practically nothing in the mysqld log, signal 11.

            Ninpo Alex Boag-Munroe added a comment - - edited 170710 15:49:25 [ERROR] mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware.   To report this bug, see https://mariadb.com/kb/en/reporting-bugs   We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.   Server version: 10.1.24-MariaDB-1~xenial key_buffer_size=1073741824 read_buffer_size=262144 max_used_connections=1531 max_threads=361 thread_count=110 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1275725 K bytes of memory Hope that's ok; if not, decrease some variables in the equation.   Thread pointer: 0x7f99fe48e008 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... That's all that appeared in the mysql error log. _usr_sbin_mysqld.112.crash That's all apport gave in the end. I appreciate this isn't much to go on. The line in dmesg claimed the crash occurred in jemalloc and I'm wondering if we're still having a problem just by virtue of tokudb being loaded still, as it took much longer this time for the crash to develop. The other symptoms are the same as before, practically nothing in the mysqld log, signal 11.
            Ninpo Alex Boag-Munroe made changes -
            Attachment show-create-index.txt [ 43858 ]

            @elena:

            Attached is the show create/show index output you asked for. show-create-index.txt

            Ninpo Alex Boag-Munroe added a comment - @elena: Attached is the show create/show index output you asked for. show-create-index.txt

            Thanks. So far I couldn't make a test case for InnoDB, even with given structures, although I'm getting the exact same execution plan as in example-query.sql. So, for now let's stick to the test case with TokuDB; I expect when Varun (or whoever ends up fixing it) finds the reason of the failure, he will be able to construct a test case with an arbitrary engine. If not, and the found reason turns out to be TokuDB-specific, we'll have to return to the drawing board.

            elenst Elena Stepanova added a comment - Thanks. So far I couldn't make a test case for InnoDB, even with given structures, although I'm getting the exact same execution plan as in example-query.sql. So, for now let's stick to the test case with TokuDB; I expect when Varun (or whoever ends up fixing it) finds the reason of the failure, he will be able to construct a test case with an arbitrary engine. If not, and the found reason turns out to be TokuDB-specific, we'll have to return to the drawing board.

            I was thinking about this and popped the following questions onto IRC, probably better suited here:

            Could it be a certain memory saturation issue regardling the size of our buffer pool/memory footprint and jemalloc? I recall there being settings you had to set for vm memory allocation for redis when it came to using jemalloc but no mention of anything like that for mariadb, vm.overcommit I think the setting is (the random sig11 after a period of time)

            We don't use hugepages traditionally because of toku, there's 200GB+ in active memory when the crash comes down. Lots of big numbers flying around (400k+ open tables, 200GB+ in memory data) and runtime seems to matter, that or there's a % chance the crash can happy on any given culprit event and whatever changes we made in removing tokudb engine usage has affected that % chance of crash.

            Ninpo Alex Boag-Munroe added a comment - I was thinking about this and popped the following questions onto IRC, probably better suited here: Could it be a certain memory saturation issue regardling the size of our buffer pool/memory footprint and jemalloc? I recall there being settings you had to set for vm memory allocation for redis when it came to using jemalloc but no mention of anything like that for mariadb, vm.overcommit I think the setting is (the random sig11 after a period of time) We don't use hugepages traditionally because of toku, there's 200GB+ in active memory when the crash comes down. Lots of big numbers flying around (400k+ open tables, 200GB+ in memory data) and runtime seems to matter, that or there's a % chance the crash can happy on any given culprit event and whatever changes we made in removing tokudb engine usage has affected that % chance of crash.
            varun Varun Gupta (Inactive) added a comment - - edited

            In the function DsMrr_impl::setup_buffer_sharing, buffer space is shared between the rowid buffer and key buffer

             
              /* The whole buffer space that we're using */
              uchar *full_buf;
              uchar *full_buf_end;
            
            

            Size of buffer

            (gdb) p full_buf_end - full_buf
            $1 = 11
             
            (gdb) p key_buff_elem_size
            $3 = 12
            (gdb) p rowid_buf_elem_size
            $4 = 17
             
            (gdb) p bytes_for_keys
            $9 = 5
            (gdb) p bytes_for_rowids
            $12 = 6
            

            This condition will be true

            if (bytes_for_keys < key_buff_elem_size + 1)
            {
               ptrdiff_t add= key_buff_elem_size + 1 - bytes_for_keys;
               bytes_for_keys= key_buff_elem_size + 1;
               bytes_for_rowids -= add;
            }
            

            (gdb) p bytes_for_rowids
            $12 = -2
            (gdb) p bytes_for_keys
            $12 = 13
            

            if (bytes_for_rowids < (ptrdiff_t)rowid_buf_elem_size + 1)
            {
                ptrdiff_t add= (ptrdiff_t)(rowid_buf_elem_size + 1 - bytes_for_rowids);
                bytes_for_rowids= (ptrdiff_t)rowid_buf_elem_size + 1;
                bytes_for_keys -= add;
            }
            

            (gdb) p bytes_for_rowids
            $12 = 18
            (gdb) p bytes_for_keys
            $12 = -7
            
            

            So here we see, we are running over the buffer that is we are accessing memory which is not alloted

            (gdb) p (full_buf + bytes_for_rowids) > full_buf_end
            $2 = true
            
            

            varun Varun Gupta (Inactive) added a comment - - edited In the function DsMrr_impl::setup_buffer_sharing, buffer space is shared between the rowid buffer and key buffer   /* The whole buffer space that we're using */ uchar *full_buf; uchar *full_buf_end; Size of buffer (gdb) p full_buf_end - full_buf $1 = 11   (gdb) p key_buff_elem_size $3 = 12 (gdb) p rowid_buf_elem_size $4 = 17   (gdb) p bytes_for_keys $9 = 5 (gdb) p bytes_for_rowids $12 = 6 This condition will be true if (bytes_for_keys < key_buff_elem_size + 1) { ptrdiff_t add= key_buff_elem_size + 1 - bytes_for_keys; bytes_for_keys= key_buff_elem_size + 1; bytes_for_rowids -= add; } (gdb) p bytes_for_rowids $12 = -2 (gdb) p bytes_for_keys $12 = 13 if (bytes_for_rowids < (ptrdiff_t)rowid_buf_elem_size + 1) { ptrdiff_t add= (ptrdiff_t)(rowid_buf_elem_size + 1 - bytes_for_rowids); bytes_for_rowids= (ptrdiff_t)rowid_buf_elem_size + 1; bytes_for_keys -= add; } (gdb) p bytes_for_rowids $12 = 18 (gdb) p bytes_for_keys $12 = -7 So here we see, we are running over the buffer that is we are accessing memory which is not alloted (gdb) p (full_buf + bytes_for_rowids) > full_buf_end $2 = true
            elenst Elena Stepanova made changes -
            Comment [ A comment with security level 'Users' was removed. ]
            elenst Elena Stepanova made changes -
            Labels need_feedback
            jplindst Jan Lindström (Inactive) made changes -
            Attachment stack1 [ 43918 ]

            Added one of the latest full stack trace from core file.

            jplindst Jan Lindström (Inactive) added a comment - Added one of the latest full stack trace from core file.

            Worth noting, apparently the crash occurs in this code whether any of the mrr optimizer switches are enabled or not.

            Ninpo Alex Boag-Munroe added a comment - Worth noting, apparently the crash occurs in this code whether any of the mrr optimizer switches are enabled or not.

            Alex,
            The comment just before the class Mrr_simple_index_reader (multi_range_read.h) says:

            /*
              A "bypass" index reader that just does and index scan. The index scan is done 
              by calling default MRR implementation (i.e.  handler::multi_range_read_XXX())
              functions.
            */
            

            So the server performs just a regular index scan and crashes there.

            igor Igor Babaev (Inactive) added a comment - Alex, The comment just before the class Mrr_simple_index_reader (multi_range_read.h) says: /* A "bypass" index reader that just does and index scan. The index scan is done by calling default MRR implementation (i.e. handler::multi_range_read_XXX()) functions. */ So the server performs just a regular index scan and crashes there.
            jplindst Jan Lindström (Inactive) made changes -
            monty Michael Widenius made changes -
            Assignee Varun Gupta [ varun ] Michael Widenius [ monty ]
            monty Michael Widenius made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]

            Problem was a memory overflow in DsMrr_impl::setup_buffer_sharing() when used with a very small buffer.

            monty Michael Widenius added a comment - Problem was a memory overflow in DsMrr_impl::setup_buffer_sharing() when used with a very small buffer.
            monty Michael Widenius made changes -
            issue.field.resolutiondate 2017-08-03 16:29:44.0 2017-08-03 16:29:44.103
            monty Michael Widenius made changes -
            Fix Version/s 10.0.32 [ 22504 ]
            Fix Version/s 10.1.26 [ 22553 ]
            Fix Version/s 10.2.8 [ 22544 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.0 [ 16000 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            dbart Daniel Bartholomew made changes -
            Fix Version/s 10.0.33 [ 22552 ]
            Fix Version/s 10.0.32 [ 22504 ]
            dbart Daniel Bartholomew made changes -
            Fix Version/s 10.0.32 [ 22504 ]
            Fix Version/s 10.0.33 [ 22552 ]
            varun Varun Gupta (Inactive) added a comment - - edited

            SET join_cache_level = 0;
            SET optimizer_switch ='mrr=on,mrr_sort_keys=on,optimize_join_buffer_size=on';
            ANALYZE 
            SELECT * FROM t1 AS t1_outer WHERE EXISTS ( SELECT * FROM t2 WHERE i2 IN ( SELECT i3 FROM t3 INNER JOIN t1 AS t1_inner ON (t1_inner.c1 = c3 ) WHERE t1_inner.i1 < t1_outer.i1 ) );
            id	select_type	table	type	possible_keys	key	key_len	ref	rows	r_rows	filtered	r_filtered	Extra
            1	PRIMARY	t1_outer	ALL	NULL	NULL	NULL	NULL	5	5.00	100.00	80.00	Using where
            2	DEPENDENT SUBQUERY	t2	ALL	NULL	NULL	NULL	NULL	2	1.20	100.00	100.00	Using where
            2	DEPENDENT SUBQUERY	t3	ref	i3	i3	5	test.t2.i2	1	4.00	100.00	100.00	Using where
            2	DEPENDENT SUBQUERY	t1_inner	ref	c1	c1	4	test.t3.c3	1	0.38	100.00	44.44	Using where; FirstMatch(t2)
            
            

            varun Varun Gupta (Inactive) added a comment - - edited SET join_cache_level = 0; SET optimizer_switch ='mrr=on,mrr_sort_keys=on,optimize_join_buffer_size=on'; ANALYZE SELECT * FROM t1 AS t1_outer WHERE EXISTS ( SELECT * FROM t2 WHERE i2 IN ( SELECT i3 FROM t3 INNER JOIN t1 AS t1_inner ON (t1_inner.c1 = c3 ) WHERE t1_inner.i1 < t1_outer.i1 ) ); id select_type table type possible_keys key key_len ref rows r_rows filtered r_filtered Extra 1 PRIMARY t1_outer ALL NULL NULL NULL NULL 5 5.00 100.00 80.00 Using where 2 DEPENDENT SUBQUERY t2 ALL NULL NULL NULL NULL 2 1.20 100.00 100.00 Using where 2 DEPENDENT SUBQUERY t3 ref i3 i3 5 test.t2.i2 1 4.00 100.00 100.00 Using where 2 DEPENDENT SUBQUERY t1_inner ref c1 c1 4 test.t3.c3 1 0.38 100.00 44.44 Using where; FirstMatch(t2)
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 81055 ] MariaDB v4 [ 152258 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 135460

            People

              monty Michael Widenius
              Ninpo Alex Boag-Munroe
              Votes:
              2 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.