MariaDB crash on Power8 when built with advance tool chain

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.0.12
    • Fix Version/s: 10.0.13
    • Component/s: None
    • Labels: None
    • Environment: Power8 RH6.5 (big endian)

    Description

      NB: A fix for this bug is also present in Stewart Smith's patchset: mysql-power-sync-mutex-lockword.patch

      A multithreaded workload that includes writes to the database crashes the MariaDB server. This happens only if MariaDB was built with the Advance Tool Chain (ATC); it has been verified for a local 10.0.12 (release) build as well as a buildbot build of 10.0/rev4292.

      To build with ATC, simply put /opt/at7.0/bin at the front of your $PATH and build as usual (cmake, make). The ATC contains enhanced versions of the GNU toolchain.

      The last 5 observed crashes are triggered by abort() within InnoDB. Typical error log:

      2014-07-16 11:15:19 fff8c207190 InnoDB: Assertion failure in thread 17590242013584 in file ut0lst.h line 271
      InnoDB: Failing assertion: list.count > 0
      InnoDB: We intentionally generate a memory trap.

      Typical backtrace from corefile:

      #0 0x00000fff8f583a40 in __pthread_kill (threadid=<optimized out>, signo=<optimized out>)
      at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:61
      #1 0x00000000109b5698 in my_write_core (sig=<optimized out>)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/mysys/stacktrace.c:457
      #2 0x00000000103d5324 in handle_fatal_signal (sig=<optimized out>)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/signal_handler.cc:262
      #3 <signal handler called>
      #4 0x00000fff8ed7f8f0 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
      #5 0x00000fff8ed818f4 in __GI_abort () at abort.c:89
      #6 0x0000000010124588 in ut_list_remove<ut_list_base<trx_t>, trx_t> (offset=568, elem=..., list=...)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/include/ut0lst.h:271
      #7 0x00000000107ee004 in ut_list_remove<ut_list_base<trx_t>, trx_t> (offset=<optimized out>, elem=...,
      list=...) at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1151
      #8 trx_commit_in_memory (lsn=365223679, trx=0xfff2800be08)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1408
      #9 trx_commit_low (trx=0xfff2800be08, mtr=<optimized out>)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1605
      #10 0x00000000107ee128 in trx_commit (trx=trx@entry=0xfff2800be08)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1626
      #11 0x00000000107ee8dc in trx_commit_for_mysql (trx=0xfff2800be08)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1854
      #12 0x00000000106c7468 in innobase_commit_low (trx=0xfff2800be08)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/handler/ha_innodb.cc:3929
      #13 innobase_commit_ordered_2 (trx=trx@entry=0xfff2800be08, thd=thd@entry=0x1001b841858)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/handler/ha_innodb.cc:4038
      #14 0x00000000106cae0c in innobase_commit (hton=0x1001ac45818, thd=0x1001b841858, commit_trx=<optimized out>)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/handler/ha_innodb.cc:4164
      #15 0x00000000103d78c4 in commit_one_phase_2 (thd=thd@entry=0x1001b841858, all=all@entry=true,
      is_real_trans=<optimized out>, trans=<optimized out>, trans=<optimized out>)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.cc:1514
      #16 0x00000000103d9ca4 in ha_commit_one_phase (all=true, thd=0x1001b841858)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.cc:1495
      #17 ha_commit_trans (thd=0x1001b841858, all=<optimized out>)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.cc:1372
      #18 0x000000001031bea4 in trans_commit (thd=0x1001b841858)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/transaction.cc:218
      #19 0x000000001020b2f0 in mysql_execute_command (thd=thd@entry=0x1001b841858)
      at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_parse.cc:4392
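
To make the failing assertion easier to picture, here is a minimal sketch of a counted, intrusive doubly-linked list whose remove operation checks `count > 0` before unlinking. The names `Node`, `CountedList`, `list_add_last` and `list_remove` are hypothetical and this is not the code from ut0lst.h; it only illustrates the kind of bookkeeping that the check protects. A list like this is safe only if every add and remove runs under a lock that correctly orders memory accesses between threads.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-ins; not the actual ut0lst.h definitions.
struct Node {
    Node* prev = nullptr;
    Node* next = nullptr;
};

struct CountedList {
    std::size_t count = 0;  // number of linked elements
    Node* first = nullptr;
    Node* last = nullptr;
};

// Append elem to the end of the list and bump the element count.
inline void list_add_last(CountedList& list, Node& elem) {
    elem.prev = list.last;
    elem.next = nullptr;
    if (list.last != nullptr) {
        list.last->next = &elem;
    } else {
        list.first = &elem;
    }
    list.last = &elem;
    ++list.count;
}

// Unlink elem from the list. The assertion mirrors the check in the error
// log: removing from a list whose count is already zero means the list
// bookkeeping has been corrupted.
inline void list_remove(CountedList& list, Node& elem) {
    assert(list.count > 0);
    if (elem.next != nullptr) {
        elem.next->prev = elem.prev;
    } else {
        list.last = elem.prev;
    }
    if (elem.prev != nullptr) {
        elem.prev->next = elem.next;
    } else {
        list.first = elem.next;
    }
    elem.prev = nullptr;
    elem.next = nullptr;
    --list.count;
}
```

In a single thread the count can never go wrong here. The sketch assumes that callers serialize access with a mutex, which is why a lock that does not order memory accesses properly can surface as this kind of assertion failure.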

      How to repeat

      I have set up a test environment on the Power8 benchmark box (see the internal wiki for details on how to access the machine).

      • log in as user mariadb.
      • chdir to ~/benchmark/sysbench/series03.
      • run runme.sh.

      This will fire up a server from ~/mariadb-install/mariadb-10.0.12-atc, create a single table for sysbench OLTP and run a read/write OLTP test with 8 threads. Loads with one or two threads did not crash; crashes start to happen with 4 or more threads.

      For convenience I symlinked the datadir and the mysqld binary, so you can run gdb like so:

      ~/benchmark/sysbench/series03 $ /opt/at7.0/bin/gdb mysqld datadir/core...

          Activity

            jplindst Jan Lindström (Inactive) added a comment - If this fix is correct, it should also be merged to 5.5.

            serg Sergei Golubchik added a comment - svoj, the patch looks good, thanks. But perhaps you should make a similar change for Windows? InterlockedExchange is a full memory barrier, and Windows has intrinsics with acquire and release semantics. For example, os_atomic_test_and_set_byte could be InterlockedExchangeAcquire, while os_atomic_lock_release_byte could be InterlockedAndRelease.

            svoj Sergey Vojtovich added a comment - Sergei, please review addition to the original patch. It implements your suggestion.

            serg Sergei Golubchik added a comment - thanks, ok to push!

            svoj Sergey Vojtovich added a comment - Pushed additional patch to 10.0.13:

            revno: 4319
            revision-id: svoj@mariadb.org-20140731103105-6jnn5m8sazm5fvpg
            parent: sergii@pisem.net-20140731161437-oxyzqaskmptm5ssv
            committer: Sergey Vojtovich <svoj@mariadb.org>
            branch nick: 10.0
            timestamp: Thu 2014-07-31 14:31:05 +0400
            message:
              MDEV-6450 - MariaDB crash on Power8 when built with advance
                          tool chain

              This is an addition to the original patch. On Windows
              InterlockedExchange implies full memory barrier, whereas
              only acquire/release barriers required.
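
The acquire/release pairing discussed in the comments above can be sketched in portable C++11 as follows. This is a minimal illustration under stated assumptions, not the patch itself: `ByteSpinLock` and `run_counter_demo` are hypothetical names, and `std::atomic` stands in for the platform-specific primitives that os_atomic_test_and_set_byte and os_atomic_lock_release_byte wrap. Taking the lock uses an exchange with acquire ordering, and releasing it uses a store with release ordering, so neither side needs a full memory barrier.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical byte-sized lock word, loosely modelled on the
// test-and-set / release pair named in the comments above.
class ByteSpinLock {
public:
    void lock() {
        // Acquire ordering: accesses inside the critical section cannot be
        // reordered before the point where the lock is taken.
        while (word_.exchange(1, std::memory_order_acquire) != 0) {
            std::this_thread::yield();
        }
    }

    void unlock() {
        // Release ordering: writes made inside the critical section become
        // visible before the lock word is seen as free again.
        word_.store(0, std::memory_order_release);
    }

private:
    std::atomic<unsigned char> word_{0};
};

// Increment a plain (non-atomic) counter from several threads under the
// lock and return the final value.
inline long run_counter_demo(int n_threads, int iterations) {
    ByteSpinLock lock;
    long counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < n_threads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return counter;
}
```

Calling `run_counter_demo(8, 10000)` exercises the lock with the same thread count as the sysbench run described above. With correct acquire/release ordering no increment is lost, whereas a lock word updated without those orderings could lose updates on a weakly ordered CPU such as POWER.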

            People

              svoj Sergey Vojtovich
              axel Axel Schwenke
              Votes: 0
              Watchers: 6