[MDEV-6450] MariaDB crash on Power8 when built with advance tool chain Created: 2014-07-16  Updated: 2014-08-29  Resolved: 2014-08-04

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.12
Fix Version/s: 10.0.13

Type: Bug Priority: Critical
Reporter: Axel Schwenke Assignee: Sergey Vojtovich
Resolution: Fixed Votes: 0
Labels: None
Environment:

Power8 RH6.5 (big endian)


Issue Links:
Duplicate
is duplicated by MDEV-6615 Incorrect barrier logic in mutex_exit... Closed
PartOf
is part of MDEV-6478 MariaDB on Power8 Closed
is part of MDEV-6530 Examine and apply Power8 patches sugg... Closed

 Description   

NB: Fix for this bug also present in Stewart Smith' patchset: mysql-power-sync-mutex-lockword.patch

Multithreaded workload that includes writes to the database, causes a crash of the MariaDB server. This happens only if MariaDB was built with the Advance Tool Chain (ATC) and has been verified for a local 10.0.12 (release) build as well as a buildbot build of 10.0/rev4292.

To build with ATC, simply put /opt/at7.0/bin at the from of your $PATH and build like normal (cmake, make). The ATC contains enhanced versions of the GNU toolchain.

The last 5 observed crashes are triggered by abort() within InnoDB. Typical error log:

2014-07-16 11:15:19 fff8c207190 InnoDB: Assertion failure in thread 17590242013584 in file ut0lst.h line 271
InnoDB: Failing assertion: list.count > 0
InnoDB: We intentionally generate a memory trap.

Typical backtrace from corefile:

#0 0x00000fff8f583a40 in __pthread_kill (threadid=<optimized out>, signo=<optimized out>)
at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:61
#1 0x00000000109b5698 in my_write_core (sig=<optimized out>)
at /home/mariadb/mariadb-source/mariadb-10.0.12/mysys/stacktrace.c:457
#2 0x00000000103d5324 in handle_fatal_signal (sig=<optimized out>)
at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/signal_handler.cc:262
#3 <signal handler called>
#4 0x00000fff8ed7f8f0 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5 0x00000fff8ed818f4 in __GI_abort () at abort.c:89
#6 0x0000000010124588 in ut_list_remove<ut_list_base<trx_t>, trx_t> (offset=568, elem=..., list=...)
at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/include/ut0lst.h:271
#7 0x00000000107ee004 in ut_list_remove<ut_list_base<trx_t>, trx_t> (offset=<optimized out>, elem=...,
list=...) at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1151
#8 trx_commit_in_memory (lsn=365223679, trx=0xfff2800be08)
at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1408
#9 trx_commit_low (trx=0xfff2800be08, mtr=<optimized out>)
at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1605
#10 0x00000000107ee128 in trx_commit (trx=trx@entry=0xfff2800be08)
at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1626
#11 0x00000000107ee8dc in trx_commit_for_mysql (trx=0xfff2800be08)
at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/trx/trx0trx.cc:1854
#12 0x00000000106c7468 in innobase_commit_low (trx=0xfff2800be08)
at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/handler/ha_innodb.cc:3929
#13 innobase_commit_ordered_2 (trx=trx@entry=0xfff2800be08, thd=thd@entry=0x1001b841858)
at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/handler/ha_innodb.cc:4038
#14 0x00000000106cae0c in innobase_commit (hton=0x1001ac45818, thd=0x1001b841858, commit_trx=<optimized out>)
at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/handler/ha_innodb.cc:4164
#15 0x00000000103d78c4 in commit_one_phase_2 (thd=thd@entry=0x1001b841858, all=all@entry=true,
is_real_trans=<optimized out>, trans=<optimized out>, trans=<optimized out>)
at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.cc:1514
#16 0x00000000103d9ca4 in ha_commit_one_phase (all=true, thd=0x1001b841858)
at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.cc:1495
#17 ha_commit_trans (thd=0x1001b841858, all=<optimized out>)
at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.cc:1372
#18 0x000000001031bea4 in trans_commit (thd=0x1001b841858)
at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/transaction.cc:218
#19 0x000000001020b2f0 in mysql_execute_command (thd=thd@entry=0x1001b841858)
at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_parse.cc:4392

How to repeat

I have setup a test environment on the Power8 benchmark box (see internal wiki for details how to access the machine).

  • log in as user mariadb.
  • chdir to ~/benchmark/sysbench/series03.
  • run runme.sh.

This will fire up a server from ~/mariadb-install/mariadb-10.0.12-atc, create a single table for sysbench OLTP and run a read/write OLTP test with 8 threads. Single/double threaded load did not crash, with 4 or more threads crashes start to happen.

For convenience I symlinked the datadir and mysqld binary. So you can run gdb like so

~/benchmark/sysbench/series03 $/opt/at7.0/bin/gdb mysqld datadir/core...



 Comments   
Comment by Axel Schwenke [ 2014-07-16 ]

after looking at MDEV-6340 I added -fno-delete-null-pointer-checks to CFLAGS. But this doesn't solve the issue.

Comment by Axel Schwenke [ 2014-07-16 ]

A crash happens immediately for a debug build. When the rw_lock_debug_mutex is released, it turns out the thread didn't own it. Huh?

Errorlog:

2014-07-16 16:43:39 fff882d7190  InnoDB: Assertion failure in thread 17590175756688 in file sync0sync.ic line 166
InnoDB: Failing assertion: mutex_own(mutex)
InnoDB: We intentionally generate a memory trap.

Stack:

#0  0x00000fffa54e3a40 in __pthread_kill (threadid=<optimized out>, signo=<optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:61
#1  0x0000000010d1346c in my_write_core (sig=6)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/mysys/stacktrace.c:457
#2  0x0000000010508284 in handle_fatal_signal (sig=6)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/signal_handler.cc:262
#3  <signal handler called>
#4  0x00000fffa4cdf8f0 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5  0x00000fffa4ce18f4 in __GI_abort () at abort.c:89
#6  0x0000000010a7b1d4 in mutex_exit_func (mutex=0x11ca9ed0 <rw_lock_debug_mutex>)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/include/sync0sync.ic:166
#7  0x0000000010a7b630 in pfs_mutex_exit_func (mutex=0x11ca9ed0 <rw_lock_debug_mutex>)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/include/sync0sync.ic:477
#8  0x0000000010a7dc3c in rw_lock_debug_mutex_exit ()
    at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/sync/sync0rw.cc:948
#9  0x0000000010a7df28 in rw_lock_remove_debug_info (lock=0x10018e35f98, pass=0, lock_type=352)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/sync/sync0rw.cc:1025
#10 0x0000000010a358f4 in rw_lock_s_unlock_func (pass=0, lock=0x10018e35f98)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/include/sync0rw.ic:614
#11 0x0000000010a35bbc in pfs_rw_lock_s_unlock_func (pass=0, lock=0x10018e35f98)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/include/sync0rw.ic:1251
#12 0x0000000010a42e70 in row_search_for_mysql (buf=0xfff3400adc8 "\377RQ", mode=2, prebuilt=0xfff340327a8,
    match_mode=1, direction=0)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/row/row0sel.cc:3999
#13 0x00000000108c563c in ha_innobase::index_read (this=0xfff34008d78, buf=0xfff3400adc8 "\377RQ",
    key_ptr=0xfff48007070 "RQ", key_len=4, find_flag=HA_READ_KEY_EXACT)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/storage/xtradb/handler/ha_innodb.cc:8371
#14 0x000000001051c91c in handler::index_read_map (this=0xfff34008d78, buf=0xfff3400adc8 "\377RQ",
    key=0xfff48007070 "RQ", keypart_map=1, find_flag=HA_READ_KEY_EXACT)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.h:2888
#15 0x00000000105193e4 in handler::index_read_idx_map (this=0xfff34008d78, buf=0xfff3400adc8 "\377RQ",
    index=0, key=0xfff48007070 "RQ", keypart_map=1, find_flag=HA_READ_KEY_EXACT)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.cc:5520
#16 0x0000000010510d90 in handler::ha_index_read_idx_map (this=0xfff34008d78, buf=0xfff3400adc8 "\377RQ",
    index=0, key=0xfff48007070 "RQ", keypart_map=1, find_flag=HA_READ_KEY_EXACT)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/handler.cc:2609
#17 0x00000000102fd258 in join_read_const (tab=0xfff48006628)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_select.cc:18167
#18 0x00000000102fcc10 in join_read_const_table (tab=0xfff48006628, pos=0xfff48006b60)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_select.cc:18042
#19 0x00000000102d2f7c in make_join_statistics (join=0xfff48005df0, tables_list=..., conds=0xfff480063f8,
    keyuse_array=0xfff480060f8) at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_select.cc:3785
#20 0x00000000102ca178 in JOIN::optimize_inner (this=0xfff48005df0)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_select.cc:1338
#21 0x00000000102c9130 in JOIN::optimize (this=0xfff48005df0)
    at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_select.cc:1023
#22 0x00000000102d143c in mysql_select (thd=0x100196077b0, rref_pointer_array=0x1001960be18,
    tables=0xfff48005438, wild_num=0, fields=..., conds=0xfff48005bc8, og_num=0, order=0x0, group=0x0,
    having=0x0, proc_param=0x0, select_options=2148797184, result=0xfff48005dd0, unit=0x1001960b4b8,
    select_lex=0x1001960bba0) at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_select.cc:3289
#23 0x00000000102c6f84 in handle_select (thd=0x100196077b0, lex=0x1001960b3f0, result=0xfff48005dd0,
    setup_tables_done_option=0) at /home/mariadb/mariadb-source/mariadb-10.0.12/sql/sql_select.cc:372
...

CFLAGS:

-Wall  -fPIC -Wall -g -DENABLED_DEBUG_SYNC -ggdb3 -DSAFE_MUTEX -DSAFEMALLOC

binaries in ~/mariadb-install-/mariadb-10.0.12-atc-debug

Comment by Axel Schwenke [ 2014-07-17 ]

OK, I guess I found the source of the issue: the gcc atomix builtins are broken with ATC. I tested this by disabling them in the source code. This however works only for InnoDB but not for XtraDB.

Code change:

--- storage/innobase/CMakeLists.txt.orig        2014-07-16 21:57:11.802113123 +0000
+++ storage/innobase/CMakeLists.txt     2014-07-16 21:50:37.334758927 +0000
@@ -126,11 +126,11 @@
 ENDIF()
 
 IF(HAVE_IB_GCC_ATOMIC_BUILTINS)
- ADD_DEFINITIONS(-DHAVE_IB_GCC_ATOMIC_BUILTINS=1)
+# ADD_DEFINITIONS(-DHAVE_IB_GCC_ATOMIC_BUILTINS=1)
 ENDIF()
 
 IF(HAVE_IB_GCC_ATOMIC_BUILTINS_64)
- ADD_DEFINITIONS(-DHAVE_IB_GCC_ATOMIC_BUILTINS_64=1)
+# ADD_DEFINITIONS(-DHAVE_IB_GCC_ATOMIC_BUILTINS_64=1)
 ENDIF()
 
  # either define HAVE_IB_ATOMIC_PTHREAD_T_GCC or not

In order to use stock InnoDB instead of XtraDB, one has to add this to my.cnf:

[mysqld]
 
#####use InnoDB plugin
ignore-builtin-innodb
plugin-load=innodb=ha_innodb.so

Right now I'm running an extensive benchmark with this build to check if the workaround is good.

Comment by Daniel Bartholomew [ 2014-07-17 ]

svoj is knee-deep in fixing this, so here's a quick update for those that were not on the IBM call this morning.

First, there's this from an email svoj sent:

InnoDB has it's own mutex implementation, which roughly look like:

typedef struct
{
  char lock;
  pthread_t thread_id;
} mutex_t;
 
void mutex_enter(mutex_t mutex)
{
  assert(mutex->thread_id != pthread_self());
  while (!__sync_lock_test_and_set(&mutex->lock, 1)) /* Acquire memory barrier */
  {
    mutex->thread_id= pthread_self();
    break;
  }
}
 
void mutex_exit(mutex_t mutex)
{
  assert(mutex->thread_id == pthread_self()); /* fails, thread_id = UNDEFINED */
  mutex->thread_id= UNDEFINED;
  __sync_lock_test_and_set(&mutex->lock, 0); /* Acquire memory barrier */
}

Fixed version of mutex_exit():

void mutex_exit(mutex_t mutex)
{
  assert(mutex->thread_id == pthread_self());
  mutex->thread_id= UNDEFINED;
  __sync_lock_release(&mutex->lock); /* Release memory barrier */
}

Acquire memory barrier in mutex_exit() looks obviously wrong. Memory writes
after such barrier are not guaranteed to become visible. I suspect it works
properly with gcc (4.4.7) because it creates either full or acquire-release
memory barrier, whereas ATC properly create acquire memory barrier. Do you
think it may be the case?

There is a comment around that code:

/* In theory __sync_lock_release should be used to release the lock.
Unfortunately, it does not work properly alone. The workaround is
that more conservative __sync_lock_test_and_set is used instead. */

Though it doesn't really explain why exactly it didn't work properly.

svoj and axel are now working with IBM to verify that the fix is correct and running tests.

Comment by Sergey Vojtovich [ 2014-07-17 ]

Another comment regarding buggy __sync_lock_release: http://bugs.mysql.com/bug.php?id=34175 (see [6 Feb 2008 19:21] Larry Zhou). It still doesn't give meaningful answer what's wrong with this function.

OTOH it says "GCC 4.1.1 with 64-bit opteron" and "There were AMD hardware bug, and GCC bug reported by Boehm". Probably we can assume it is fixed now.

Comment by Sergey Vojtovich [ 2014-07-18 ]

Sergei please review fix for this bug.

The patch has been pushed to 10.0.13:

revno: 4292
revision-id: svoj@mariadb.org-20140718111625-uch1ssbh8kf6i4ib
parent: jplindst@mariadb.org-20140715075753-24ptezu36xn4q37v
committer: Sergey Vojtovich <svoj@mariadb.org>
branch nick: 10.0
timestamp: Fri 2014-07-18 15:16:25 +0400
message:
  MDEV-6450 - MariaDB crash on Power8 when built with advance tool
              chain
  
  InnoDB mutex_exit() function calls __sync_test_and_set() to release
  the lock. According to manual this function is supposed to create
  "acquire" memory barrier whereas in fact we need "release" memory
  barrier at mutex_exit().
  
  The problem isn't repeatable with gcc because it creates
  "acquire-release" memory barrier for __sync_test_and_set().
  ATC creates just "acquire" barrier.
  
  Fixed by creating proper barrier at mutex_exit() by using
  __sync_lock_release() instead of __sync_test_and_set().\

Comment by Jan Lindström (Inactive) [ 2014-07-23 ]

If this fix is correct, it should be merged also to 5.5

Comment by Sergei Golubchik [ 2014-07-30 ]

svoj, the patch looks good, thanks.

But perhaps, you should do a similar change for Windows? InterlockedExchange is a full memory barrier. Windows has intrinsics with acquire and release semantics. For example, os_atomic_test_and_set_byte could be InterlockedExchangeAcquire, while os_atomic_lock_release_byte could be InterlockedAndRelease.

Comment by Sergey Vojtovich [ 2014-07-31 ]

Sergei, please review addition to the original patch. It implements your suggestion.

Comment by Sergei Golubchik [ 2014-07-31 ]

thanks, ok to push!

Comment by Sergey Vojtovich [ 2014-08-04 ]

Pushed additional patch to 10.0.13:

revno: 4319
revision-id: svoj@mariadb.org-20140731103105-6jnn5m8sazm5fvpg
parent: sergii@pisem.net-20140731161437-oxyzqaskmptm5ssv
committer: Sergey Vojtovich <svoj@mariadb.org>
branch nick: 10.0
timestamp: Thu 2014-07-31 14:31:05 +0400
message:
  MDEV-6450 - MariaDB crash on Power8 when built with advance
              tool chain
  
  This is an addition to the original patch. On Windows
  InterlockedExchange implies full memory barrier, whereas
  only acquire/release barriers required.

Generated at Thu Feb 08 07:12:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.