MDEV-34906 (MariaDB Server)

Galera node crashes sporadically when running OLTP multi master load.

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • Affects Version/s: 10.11
    • Fix Version/s: 10.11
    • Galera

    Description

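      A Galera node crashes sporadically under a multi-master OLTP (sysbench) load. The crash appears to originate in InnoDB: a debug assertion fails in row_purge_remove_sec_if_poss_tree() (row0purge.cc, line 832) on a purge worker thread. The issue could not be reproduced under rr. The node configuration files and the full backtrace are attached.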

      Test case

      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb-admin --user=root --socket=/home/ramesh/framework/node3/mysql.sock shutdown
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb-admin --user=root --socket=/home/ramesh/framework/node2/mysql.sock shutdown
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb-admin --user=root --socket=/home/ramesh/framework/node1/mysql.sock shutdown
       
      rm -rf /home/ramesh/framework/node*
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/scripts/mariadb-install-db --no-defaults --force  --auth-root-authentication-method=normal  --basedir=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg --datadir=/home/ramesh/framework/node1 > /home/ramesh/framework/log/startup1.log 2>&1
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/scripts/mariadb-install-db --no-defaults --force  --auth-root-authentication-method=normal  --basedir=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg --datadir=/home/ramesh/framework/node2 > /home/ramesh/framework/log/startup2.log 2>&1
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/scripts/mariadb-install-db --no-defaults --force  --auth-root-authentication-method=normal  --basedir=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg --datadir=/home/ramesh/framework/node3 > /home/ramesh/framework/log/startup3.log 2>&1
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadbd --defaults-file=/home/ramesh/framework/conf/node1.cnf --wsrep-provider=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/lib/libgalera_smm.so  --basedir=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg --wsrep-new-cluster > /home/ramesh/framework/node1/node1.err 2>&1 &
       
      sleep 10
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb --user=root --socket=/home/ramesh/framework/node1/mysql.sock -Bse"SET SESSION sql_log_bin=0;delete from mysql.user where user='';" > /dev/null 2>&1
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadbd --defaults-file=/home/ramesh/framework/conf/node2.cnf --wsrep-provider=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/lib/libgalera_smm.so  --basedir=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg  > /home/ramesh/framework/node2/node2.err 2>&1 &
       
      sleep 20
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb --user=root --socket=/home/ramesh/framework/node2/mysql.sock -Bse"SET SESSION sql_log_bin=0;delete from mysql.user where user='';" > /dev/null 2>&1
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadbd --defaults-file=/home/ramesh/framework/conf/node3.cnf --wsrep-provider=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/lib/libgalera_smm.so  --basedir=/home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg   > /home/ramesh/framework/node3/node3.err 2>&1 &
       
      sleep 20
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb --user=root --socket=/home/ramesh/framework/node3/mysql.sock -Bse"SET SESSION sql_log_bin=0;delete from mysql.user where user='';" > /dev/null 2>&1
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb --user=root --socket=/home/ramesh/framework/node1/mysql.sock -e"drop database if exists test; create database test;" > /dev/null 2>&1
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb --user=root --socket=/home/ramesh/framework/node1/mysql.sock -e"create user if not exists sysbench@'localhost' identified  by 'sysbench';grant all on *.* to sysbench@'localhost';" > /dev/null 2>&1
       
      sysbench /usr/share/sysbench/oltp_insert.lua --table-size=1000 --tables=10 --threads=10 --mysql-db=test  --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql  --mysql-socket=/home/ramesh/framework/node1/mysql.sock prepare >/home/ramesh/framework/log/sysbench_prepare.log
       
      sysbench /usr/share/sysbench/oltp_read_write.lua --table-size=1000 --tables=10 --threads=10 --mysql-db=test  --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql  --mysql-socket=/home/ramesh/framework/node1/mysql.sock --time=1000 --db-ps-mode=disable run > /home/ramesh/framework/log/sysbench_read_write_10_node1.log &
       
      sysbench /usr/share/sysbench/oltp_read_write.lua --table-size=1000 --tables=10 --threads=10 --mysql-db=test  --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql  --mysql-socket=/home/ramesh/framework/node2/mysql.sock --time=1000 --db-ps-mode=disable run > /home/ramesh/framework/log/sysbench_read_write_10_node2.log &
       
      sysbench /usr/share/sysbench/oltp_read_write.lua --table-size=1000 --tables=10 --threads=10 --mysql-db=test  --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql  --mysql-socket=/home/ramesh/framework/node3/mysql.sock --time=1000 --db-ps-mode=disable run > /home/ramesh/framework/log/sysbench_read_write_10_node3.log &
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb-admin --user=root --socket=/home/ramesh/framework/node3/mysql.sock shutdown
       
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb --user=root --socket=/home/ramesh/framework/node1/mysql.sock -e"drop database if exists test_one; create database test_one;" > /dev/null 2>&1
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb --user=root --socket=/home/ramesh/framework/node1/mysql.sock -e"drop database if exists test_two; create database test_two;" > /dev/null 2>&1
      /home/ramesh/framework/GAL_MD040924-mariadb-10.11.10-linux-x86_64-dbg/bin/mariadb --user=root --socket=/home/ramesh/framework/node1/mysql.sock -e"drop database if exists test_three; create database test_three;" > /dev/null 2>&1
       
      sysbench /usr/share/sysbench/oltp_insert.lua --table-size=1000 --tables=10 --threads=10 --mysql-db=test_one  --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql  --mysql-socket=/home/ramesh/framework/node1/mysql.sock prepare >/home/ramesh/framework/log/sysbench_prepare_test_one.log
      sysbench /usr/share/sysbench/oltp_insert.lua --table-size=1000 --tables=10 --threads=10 --mysql-db=test_two  --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql  --mysql-socket=/home/ramesh/framework/node1/mysql.sock prepare >/home/ramesh/framework/log/sysbench_prepare_test_two.log
      sysbench /usr/share/sysbench/oltp_insert.lua --table-size=1000 --tables=10 --threads=10 --mysql-db=test_three  --mysql-user=sysbench --mysql-password=sysbench --db-driver=mysql  --mysql-socket=/home/ramesh/framework/node1/mysql.sock prepare >/home/ramesh/framework/log/sysbench_prepare_test_three.log
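
The fixed `sleep 10` and `sleep 20` pauses in the test case assume that each node has joined the cluster before the next command runs. A minimal sketch of an alternative, which is not part of the original test case: poll the `wsrep_local_state_comment` status variable until the node reports `Synced`. The function name `wait_for_synced`, the `CLIENT` variable, and the 60-second default timeout are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the reported test case.
# CLIENT is an assumed variable naming the client binary; it defaults
# to "mariadb" on PATH and can point at a specific build instead.
CLIENT="${CLIENT:-mariadb}"

# wait_for_synced SOCKET [TIMEOUT_SECONDS]
# Returns 0 once the node behind SOCKET reports the Galera state
# "Synced", or 1 if that does not happen within the timeout.
wait_for_synced() {
    socket="$1"
    timeout="${2:-60}"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # -N drops the column header, -B gives tab-separated output.
        state=$("$CLIENT" --user=root --socket="$socket" -N -B -e \
            "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'" 2>/dev/null |
            awk '$1 == "wsrep_local_state_comment" {print $2}')
        if [ "$state" = "Synced" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

For example, `wait_for_synced /home/ramesh/framework/node2/mysql.sock 120` could stand in for the `sleep 20` that follows the start of node2. Whether this changes the timing enough to affect the sporadic crash is unknown.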
      

      Running the test case leads to the following assertion failure in a purge worker thread:

      (gdb) bt
      #0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
      #1  0x000055709a99c104 in my_write_core (sig=sig@entry=6) at /test/10.11_dbg/mysys/stacktrace.c:424
      #2  0x000055709a256763 in handle_fatal_signal (sig=6) at /test/10.11_dbg/sql/signal_handler.cc:366
      #3  <signal handler called>
      #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
      #5  0x000014b11937a859 in __GI_abort () at abort.c:79
      #6  0x000014b11937a729 in __assert_fail_base (fmt=0x14b119510588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55709ad7c74a "0", 
          file=0x55709ae1bab0 "/test/10.11_dbg/storage/innobase/row/row0purge.cc", line=832, function=<optimized out>) at assert.c:92
      #7  0x000014b11938bfd6 in __GI___assert_fail (assertion=assertion@entry=0x55709ad7c74a "0", file=file@entry=0x55709ae1bab0 "/test/10.11_dbg/storage/innobase/row/row0purge.cc", line=line@entry=832, 
          function=function@entry=0x55709ae1c118 "bool row_purge_remove_sec_if_poss_tree(purge_node_t*, dict_index_t*, const dtuple_t*, trx_id_t)") at assert.c:101
      #8  0x000055709a71973d in row_purge_remove_sec_if_poss_tree (node=node@entry=0x55709d9cb300, index=index@entry=0x14b0781eda80, entry=entry@entry=0x14b028008960, page_max_trx_id=page_max_trx_id@entry=263186)
          at /test/10.11_dbg/storage/innobase/row/row0purge.cc:832
      #9  0x000055709a71a85e in row_purge_remove_sec_if_poss (entry=0x14b028008960, index=0x14b0781eda80, node=0x55709d9cb300) at /test/10.11_dbg/storage/innobase/row/row0purge.cc:995
      #10 row_purge_del_mark (node=0x55709d9cb300) at /test/10.11_dbg/storage/innobase/row/row0purge.cc:1027
      #11 row_purge_record_func (node=node@entry=0x55709d9cb300, undo_rec=undo_rec@entry=0x14b10855094a "\tr\016\002\033", thr=thr@entry=0x55709d9ca720, updated_extern=<optimized out>)
          at /test/10.11_dbg/storage/innobase/row/row0purge.cc:1530
      #12 0x000055709a71c544 in row_purge (thr=<optimized out>, undo_rec=<optimized out>, node=<optimized out>) at /test/10.11_dbg/storage/innobase/row/row0purge.cc:1591
      #13 row_purge_step (thr=thr@entry=0x55709d9ca720) at /test/10.11_dbg/storage/innobase/row/row0purge.cc:1654
      #14 0x000055709a6925b8 in que_thr_step (thr=0x55709d9ca720) at /test/10.11_dbg/storage/innobase/que/que0que.cc:554
      #15 que_run_threads_low (thr=0x55709d9ca720) at /test/10.11_dbg/storage/innobase/que/que0que.cc:610
      #16 que_run_threads (thr=thr@entry=0x55709d9ca720) at /test/10.11_dbg/storage/innobase/que/que0que.cc:630
      #17 0x000055709a76b860 in srv_task_execute () at /test/10.11_dbg/storage/innobase/srv/srv0srv.cc:1439
      #18 srv_purge_worker_task_low () at /test/10.11_dbg/storage/innobase/srv/srv0srv.cc:1570
      #19 0x000055709a76c1bc in purge_worker_callback () at /test/10.11_dbg/storage/innobase/srv/srv0srv.cc:1581
      #20 0x000055709a93ca53 in tpool::task_group::execute (this=0x55709be36080 <purge_task_group>, t=t@entry=0x55709be173c0 <purge_worker_task>) at /test/10.11_dbg/tpool/task_group.cc:70
      #21 0x000055709a93cadd in tpool::task::execute (this=0x55709be173c0 <purge_worker_task>) at /test/10.11_dbg/tpool/task.cc:32
      #22 0x000055709a93aa9f in tpool::thread_pool_generic::worker_main (this=0x55709d93a940, thread_var=0x55709d93b480) at /test/10.11_dbg/tpool/tpool_generic.cc:583
      #23 0x000055709a93bcc4 in std::__invoke_impl<void, void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__t=<optimized out>, __f=<optimized out>)
          at /usr/include/c++/9/bits/invoke.h:89
      #24 std::__invoke<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> (__fn=<optimized out>) at /usr/include/c++/9/bits/invoke.h:95
      #25 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::_M_invoke<0ul, 1ul, 2ul> (this=<optimized out>)
          at /usr/include/c++/9/thread:244
      #26 std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> >::operator() (this=<optimized out>)
          at /usr/include/c++/9/thread:251
      #27 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (tpool::thread_pool_generic::*)(tpool::worker_data*), tpool::thread_pool_generic*, tpool::worker_data*> > >::_M_run (this=<optimized out>)
          at /usr/include/c++/9/thread:195
      #28 0x000014b119771de4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
      #29 0x000014b11988b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
      #30 0x000014b119477133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      (gdb) 
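
The format string in frame #6 shows the message that glibc prints when the assertion fails, and the test case redirects each node's stderr to its `nodeN.err` file, so the message should also appear in the error log of the node that crashed. A small sketch for finding that log, assuming the directory layout used by the test case; the function name `find_assertion_logs` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the node error logs under a framework
# directory that contain a glibc "Assertion ... failed" message.
# Assumes the layout from the test case: DIR/nodeN/nodeN.err.
find_assertion_logs() {
    dir="$1"
    # -l prints only the names of files with at least one match.
    grep -l "Assertion .* failed" "$dir"/node*/node*.err 2>/dev/null
}
```

With the paths from the test case this would be called as `find_assertion_logs /home/ramesh/framework`. It exits with a non-zero status when no log contains an assertion message.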
      

      Attachments

        1. bt_all.txt
          115 kB
          Ramesh Sivaraman
        2. n1.cnf
          1.0 kB
          Ramesh Sivaraman
        3. n2.cnf
          1.0 kB
          Ramesh Sivaraman
        4. n3.cnf
          0.8 kB
          Ramesh Sivaraman
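
The report does not say how bt_all.txt was generated. One way to produce a backtrace of every thread non-interactively is gdb batch mode run against the core file, sketched below. The function name `dump_backtraces`, the `GDB` variable, and the location of the core file are assumptions.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the report.
# GDB is an assumed variable naming the debugger binary.
GDB="${GDB:-gdb}"

# dump_backtraces BINARY CORE OUTFILE
# Writes the backtraces of all threads in CORE to OUTFILE.
dump_backtraces() {
    binary="$1"
    core="$2"
    out="$3"
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 1
    fi
    # --batch exits after running the -ex commands;
    # pagination is disabled so the output is not interrupted.
    "$GDB" --batch \
        -ex 'set pagination off' \
        -ex 'thread apply all bt' \
        "$binary" "$core" > "$out" 2>&1
}
```

A call would pass the `mariadbd` binary from the debug build, the core file written by the crashed node, and an output name such as `bt_all.txt`.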

          People

            marko Marko Mäkelä
            ramesh Ramesh Sivaraman