MariaDB Server > MDEV-7026

Race in InnoDB/XtraDB mutex implementation can stall or hang the server

Details

    Description

      This failure is seen in Buildbot. It is a race that occurs somewhat rarely,
      but it can be consistently reproduced manually by running tests in a loop.

      The failure can occur in different tests, for example like this:

      http://buildbot.askmonty.org/buildbot/builders/bintar-trusty-p8-debug/builds/173

      worker[2] mysql-test-run: WARNING: Timeout after mysql-test-run waited 180 seconds for the process [mysqld.2 - pid: 31323, winpid: 31323] to create a pid file.
      rpl.rpl_incident 'stmt'                  w2 [ fail ]
              Test ended at 2014-10-31 12:04:27
       
      CURRENT_TEST: rpl.rpl_incident
       
       
      Failed to start mysqld.2
      mysqltest failed but provided no output
      The result from queries just before the failure was:
      < snip >
      call mtr.add_suppression("Slave SQL.*The incident LOST_EVENTS occured on the master.* 1590");
      include/wait_for_slave_sql_error.inc [errno=1590]
      Last_SQL_Error = 'The incident LOST_EVENTS occured on the master. Message: <none>'
      **** On Slave ****
      SELECT * FROM t1;
      a
      1
      2
      3
      SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
      START SLAVE;
      SELECT * FROM t1;
      a
      1
      2
      3
      4
      include/check_slave_is_running.inc
      DROP TABLE t1;
      include/rpl_end.inc

      More failures are seen in a cross-reference search:

      http://buildbot.askmonty.org/buildbot/reports/cross_reference#branch=&revision=&platform=-p8&dt=&bbnum=&typ=&info=&fail_name=&fail_variant=&fail_info_short=&fail_info_full=Failed+to+start+mysqld&limit=100

      It looks like it only fails in debug builds on Power8, but this is not 100%
      confirmed.

      Looking into the error log, we consistently see that it hangs during creation
      of the Doublewrite buffer:

      InnoDB: Doublewrite buffer not found: creating new

      Normally a message follows stating that the doublewrite buffer was created
      successfully, but when the error occurs, the server instead hangs at this
      point.

      Here is a GDB stacktrace from when it hangs:

      Thread 1 (Thread 0x3fff82fb5d90 (LWP 74233)):
      #0  0x00003fff82f22190 in __pthread_cond_wait (cond=0x10029bf3150, mutex=0x10029bf30c8) at pthread_cond_wait.c:187
      #1  0x0000000010b4f0ac in safe_cond_wait (cond=0x10029bf3150, mp=0x10029bf30a0, file=0x10d1a638 "/home/skysql/bzr/knielsen/5.5/storage/xtradb/os/os0sync.c", line=207)
          at /home/skysql/bzr/knielsen/5.5/mysys/thr_mutex.c:493
      #2  0x0000000010a7eae4 in os_cond_wait (cond=0x10029bf3150, mutex=0x10029bf30a0) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/os/os0sync.c:207
      #3  0x0000000010a7f6d8 in os_event_wait_low (event=0x10029bf30a0, reset_sig_count=2) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/os/os0sync.c:609
      #4  0x00000000108f544c in sync_array_wait_event (arr=0x100292584d0, index=0) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/sync/sync0arr.c:459
      #5  0x00000000108f9a20 in mutex_spin_wait (mutex=0x10029bf2ed8, file_name=0x10d01150 "/home/skysql/bzr/knielsen/5.5/storage/xtradb/fil/fil0fil.c", line=985)
          at /home/skysql/bzr/knielsen/5.5/storage/xtradb/sync/sync0sync.c:636
      #6  0x00000000109e48d4 in mutex_enter_func (mutex=0x10029bf2ed8, file_name=0x10d01150 "/home/skysql/bzr/knielsen/5.5/storage/xtradb/fil/fil0fil.c", line=985)
          at /home/skysql/bzr/knielsen/5.5/storage/xtradb/include/sync0sync.ic:222
      #7  0x00000000109e4a18 in pfs_mutex_enter_func (mutex=0x10029bf2ed8, file_name=0x10d01150 "/home/skysql/bzr/knielsen/5.5/storage/xtradb/fil/fil0fil.c", line=985)
          at /home/skysql/bzr/knielsen/5.5/storage/xtradb/include/sync0sync.ic:251
      #8  0x00000000109ee980 in fil_mutex_enter_and_prepare_for_io (space_id=0) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/fil/fil0fil.c:985
      #9  0x00000000109f97c8 in _fil_io (type=11, sync=0, space_id=0, zip_size=0, block_offset=157, byte_offset=0, len=16384, buf=0x3fff7b538000, message=0x3fff7af542c0, trx=0x0)
          at /home/skysql/bzr/knielsen/5.5/storage/xtradb/fil/fil0fil.c:5446
      #10 0x00000000109933fc in buf_flush_write_block_low (bpage=0x3fff7af542c0) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/buf/buf0flu.c:1192
      #11 0x0000000010993c64 in buf_flush_page (buf_pool=0x10029c07798, bpage=0x3fff7af542c0, flush_type=BUF_FLUSH_LIST) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/buf/buf0flu.c:1400
      #12 0x0000000010994010 in buf_flush_try_neighbors (space=0, offset=157, flush_type=BUF_FLUSH_LIST, n_flushed=48, n_to_flush=18446744073709551614)
          at /home/skysql/bzr/knielsen/5.5/storage/xtradb/buf/buf0flu.c:1541
      #13 0x00000000109944c8 in buf_flush_page_and_try_neighbors (bpage=0x3fff7af542c0, flush_type=BUF_FLUSH_LIST, n_to_flush=18446744073709551614, count=0x3fffd3831ed0)
          at /home/skysql/bzr/knielsen/5.5/storage/xtradb/buf/buf0flu.c:1638
      #14 0x00000000109949dc in buf_flush_flush_list_batch (buf_pool=0x10029c07798, min_n=18446744073709551614, lsn_limit=18446744073709551615) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/buf/buf0flu.c:1768
      #15 0x0000000010994b48 in buf_flush_batch (buf_pool=0x10029c07798, flush_type=BUF_FLUSH_LIST, min_n=18446744073709551614, lsn_limit=18446744073709551615)
          at /home/skysql/bzr/knielsen/5.5/storage/xtradb/buf/buf0flu.c:1848
      #16 0x00000000109952a8 in buf_flush_list (min_n=18446744073709551614, lsn_limit=18446744073709551615) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/buf/buf0flu.c:2065
      #17 0x0000000010a53a1c in log_preflush_pool_modified_pages (new_oldest=18446744073709551615, sync=1) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/log/log0log.c:1874
      #18 0x0000000010a548c8 in log_make_checkpoint_at (lsn=18446744073709551615, write_always=1) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/log/log0log.c:2313
      #19 0x000000001091f0e8 in trx_sys_create_doublewrite_buf () at /home/skysql/bzr/knielsen/5.5/storage/xtradb/trx/trx0sys.c:407
      #20 0x00000000108f2fbc in innobase_start_or_create_for_mysql () at /home/skysql/bzr/knielsen/5.5/storage/xtradb/srv/srv0start.c:2080
      #21 0x0000000010867b90 in innobase_init (p=0x100292581c0) at /home/skysql/bzr/knielsen/5.5/storage/xtradb/handler/ha_innodb.cc:3225
      #22 0x000000001047364c in ha_initialize_handlerton (plugin=0x100291a3d68) at /home/skysql/bzr/knielsen/5.5/sql/handler.cc:472
      #23 0x0000000010255b68 in plugin_initialize (tmp_root=0x3fffd3833690, plugin=0x100291a3d68, argc=0x11126e70 <remaining_argc>, argv=0x10029128880, options_only=false)
          at /home/skysql/bzr/knielsen/5.5/sql/sql_plugin.cc:1372
      #24 0x0000000010256718 in plugin_init (argc=0x11126e70 <remaining_argc>, argv=0x10029128880, flags=0) at /home/skysql/bzr/knielsen/5.5/sql/sql_plugin.cc:1613
      #25 0x000000001015a080 in init_server_components () at /home/skysql/bzr/knielsen/5.5/sql/mysqld.cc:4337
      #26 0x000000001015b808 in mysqld_main (argc=100, argv=0x10029128880) at /home/skysql/bzr/knielsen/5.5/sql/mysqld.cc:4934
      #27 0x00000000101509d8 in main (argc=15, argv=0x3fffd3833f98) at /home/skysql/bzr/knielsen/5.5/sql/main.cc:25

      Here is how I reproduced it, running the test in a loop and waiting for it to
      hang:

      bzr branch lp:maria/5.5
      cd 5.5
      export PATH=/opt/at7.0/bin:$PATH EXTRA_FLAGS="-O2 -Wuninitialized -DFORCE_INIT_OF_VARS" EXTRA_CONFIGS="--with-embedded-privilege-control" AM_EXTRA_MAKEFLAGS="VERBOSE=1"
      cmake . -DCMAKE_BUILD_TYPE=Debug -DMYSQL_MAINTAINER_MODE=ON
      time make -j4 package
      (cd mysql-test && for i in `seq 40000` ; do echo XXX $i XXX ; time ./mtr --parallel=4 rpl.last_insert_id rpl.rpl_row_blob_innodb --verbose-restart || exit 1 ; done)

      Attachments

        Issue Links

          Activity

            knielsen Kristian Nielsen created issue -
            svoj Sergey Vojtovich made changes -
            knielsen Kristian Nielsen made changes -

            knielsen Kristian Nielsen added a comment -

            The problem is apparently that a waiting thread is not woken up when an
            InnoDB mutex is released:

            	while (!event->is_set && event->signal_count == reset_sig_count)
            		os_cond_wait(&(event->cond_var), &(event->os_mutex));

            (gdb) p event->is_set
            $3 = 0
            (gdb) p event->signal_count
            $4 = 2
            (gdb) p reset_signal_count
            No symbol "reset_signal_count" in current context.
            (gdb) p reset_sig_count
            $5 = 2
            (gdb) p !event->is_set && event->signal_count == reset_sig_count
            $6 = 1

            The wakeup condition is not true...

            (gdb) p *mutex
            $8 = {event = 0x10029bf30a0, lock_word = 0 '\000', waiters = 1, list = {prev = 0x10029bf2f70, next = 0x11953928 <srv_misc_tmpfile_mutex>}, 
              cfile_name = 0x10d01150 "/home/skysql/bzr/knielsen/5.5/storage/xtradb/fil/fil0fil.c", cline = 1673, thread_id = 18446744073709551615, magic_n = 979585, count_os_wait = 1, count_using = 1611, 
              count_spin_loop = 6, count_spin_rounds = 175, count_os_yield = 3, lspent_time = 0, lmax_spent_time = 0, mutex_type = 0, cmutex_name = 0x10d02228 "&fil_system->mutex", pfs_psi = 0x3fff81ff2330}
            (gdb) p mutex->lock_word
            $9 = 0 '\000'

            But the mutex is not locked (lock_word == 0).

            Looks like some issue with the InnoDB mutex implementation...
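The state captured above (the mutex free with lock_word == 0, yet waiters == 1 and the thread asleep with no signal pending) matches a lost-wakeup interleaving in which the releaser's read of waiters is reordered before its store to lock_word. A minimal, deterministic replay of that suspected interleaving; the names (mutex_state, replay_lost_wakeup) are illustrative, not the actual XtraDB symbols:

```c
/* Deterministic replay of the suspected lost-wakeup interleaving.
   The releaser's waiters load is performed BEFORE its lock_word store
   becomes visible -- exactly the reordering that a missing full memory
   barrier permits on a weakly ordered CPU. */
#include <assert.h>

struct mutex_state {
    int lock_word;   /* 1 = locked, 0 = free */
    int waiters;     /* waiter announced itself before sleeping */
    int event_set;   /* os_event was signaled */
    int sleeping;    /* waiter blocked in sync_array_wait_event() */
};

static void replay_lost_wakeup(struct mutex_state *s)
{
    /* Releaser, step 1 (reordered): reads waiters early, sees 0. */
    int observed_waiters = s->waiters;

    /* Waiter: announces itself, re-checks the lock, goes to sleep. */
    s->waiters = 1;
    if (s->lock_word != 0)      /* lock still looks held */
        s->sleeping = 1;        /* os_event_wait_low() */

    /* Releaser, step 2: the lock_word store finally completes, but
       the decision whether to signal was already made on stale data. */
    s->lock_word = 0;
    if (observed_waiters)
        s->event_set = 1;       /* never executed: the wakeup is lost */
}
```

After the replay the state is exactly what GDB shows above: the mutex is free, waiters is set, and the thread sleeps with no signal pending.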

            svoj Sergey Vojtovich added a comment -

            There is a comment in mutex_exit_func():

                    /* A problem: we assume that mutex_reset_lock word
                    is a memory barrier, that is when we read the waiters
                    field next, the read must be serialized in memory
                    after the reset. A speculative processor might
                    perform the read first, which could leave a waiting
                    thread hanging indefinitely.
             
                    Our current solution call every second
                    sync_arr_wake_threads_if_sema_free()
                    to wake up possible hanging threads if
                    they are missed in mutex_signal_object. */

            Even though we added "os_isync" between the "lock_word" store and the
            "waiters" load, it is not enough. Only a full memory barrier guarantees
            StoreLoad ordering.

            So we still rely on the error monitor thread to wake up such stale
            threads. But the error monitor thread uses mutexes itself and may get
            stuck just like other threads. To work around this, Monty added
            log_get_lsn_nowait(), which basically does a trylock instead of a lock.
            This was added in revision 4271
            (monty@mariadb.org-20140819162835-sorv0ogd39f7mui8) of 5.5.

            In this case I suspect the error monitor thread is either not yet
            started or is stuck on a different mutex (other than the log_get_lsn
            mutex).
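The StoreLoad requirement can be illustrated with the classic store-buffering litmus test, sketched here with C11 atomics (toy code, not XtraDB): each thread plays a releaser that stores its own flag (the "lock_word" role) and then loads the other thread's flag (the "waiters" role). With seq_cst fences between the store and the load, the outcome r1 == 0 && r2 == 0 is forbidden by the C11 memory model; a lighter acquire/release fence, in the spirit of os_isync, would not rule it out.

```c
/* Store-buffering litmus test: with a seq_cst (full) fence between each
   thread's store and load, both threads cannot miss each other's store. */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;
static int r1, r2;

static void *t1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);   /* "lock_word" store */
    atomic_thread_fence(memory_order_seq_cst);            /* full barrier */
    r1 = atomic_load_explicit(&y, memory_order_relaxed);  /* "waiters" load */
    return NULL;
}

static void *t2(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    r2 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Run one iteration; returns 1 if the forbidden outcome was observed. */
static int sb_iteration(void)
{
    pthread_t a, b;
    atomic_store(&x, 0);
    atomic_store(&y, 0);
    pthread_create(&a, NULL, t1, NULL);
    pthread_create(&b, NULL, t2, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return r1 == 0 && r2 == 0;
}
```

Downgrading the fences to memory_order_release/memory_order_acquire re-admits the r1 == r2 == 0 outcome, which is the lost-wakeup window in the mutex code.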
            svoj Sergey Vojtovich added a comment - See also https://mariadb.atlassian.net/browse/MDEV-6654
            svoj Sergey Vojtovich made changes -
            elenst Elena Stepanova made changes -
            Assignee Jan Lindström [ jplindst ]
            knielsen Kristian Nielsen added a comment - - edited

            I wrote up a detailed analysis on maria-developers@ of what appears to be the
            cause of the bug.

            https://lists.launchpad.net/maria-developers/msg07860.html

            Incidentally, it appears that this same bug is also what is
            causing various strange hangs for a number of users lately.

            I agree that in this case, the error monitor thread is not yet started (it was
            not seen in the full stack trace). And in the user hang cases, that the error
            monitor thread itself gets stuck on a mutex (I saw this in the case I
            investigated).

            However, I strongly disagree that the whole idea of using error monitor
            thread, or trylock, is in any way an acceptable solution. If the mutex
            implementation is broken, it should be fixed, not worked around with horrible
            hacks.

            serg Sergei Golubchik made changes -
            Fix Version/s 10.0 [ 16000 ]
            Fix Version/s 5.5 [ 15800 ]
            serg Sergei Golubchik made changes -
            Priority Major [ 3 ] Blocker [ 1 ]
            serg Sergei Golubchik made changes -
            Affects Version/s 10.0.14 [ 17101 ]
            Affects Version/s 5.5.40 [ 17100 ]
            Affects Version/s 5.5 [ 15800 ]
            Affects Version/s 10.0 [ 16000 ]
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Jan Lindström [ jplindst ] Sergey Vojtovich [ svoj ]

            svoj Sergey Vojtovich added a comment -

            Kristian, please review the fix for this bug.
            svoj Sergey Vojtovich made changes -
            Assignee Sergey Vojtovich [ svoj ] Kristian Nielsen [ knielsen ]
            Status Open [ 1 ] In Review [ 10002 ]

            svoj Sergey Vojtovich added a comment -

            ...second patch:

            Date: Wed, 12 Nov 2014 11:49:27 +0000 (UTC)
            From: Sergey Vojtovich <svoj@mariadb.org>
            To: commits@mariadb.org
            Subject: [Commits] Rev 4346: MDEV-7026 - Occasional hang during startup on Power8 in lp:maria/5.5
            knielsen Kristian Nielsen added a comment - - edited

            Review sent to maria-developers@:
            https://lists.launchpad.net/maria-developers/msg07896.html
            knielsen Kristian Nielsen made changes -
            Assignee Kristian Nielsen [ knielsen ] Sergey Vojtovich [ svoj ]
            Status In Review [ 10002 ] Stalled [ 10000 ]

            svoj Sergey Vojtovich added a comment -

            Kristian, please review the updated patch.
            svoj Sergey Vojtovich made changes -
            Assignee Sergey Vojtovich [ svoj ] Kristian Nielsen [ knielsen ]
            Status Stalled [ 10000 ] In Review [ 10002 ]
            knielsen Kristian Nielsen added a comment - New review sent to maria-developers@: https://lists.launchpad.net/maria-developers/msg07913.html
            knielsen Kristian Nielsen made changes -
            Assignee Kristian Nielsen [ knielsen ] Sergey Vojtovich [ svoj ]
            Status In Review [ 10002 ] Stalled [ 10000 ]
            knielsen Kristian Nielsen added a comment - - edited

            Patch to clarify suggestions in previously made review:

            http://lists.askmonty.org/pipermail/commits/2014-November/007050.html
            lp:~maria-captains/maria/5.5-mdev7026

            knielsen Kristian Nielsen made changes -
            Summary Occasional hang during startup on Power8 Race in InnoDB/XtraDB mutex implementation can stall or hang the server

            svoj Sergey Vojtovich added a comment -

            Fixed in 10.0.15, 5.5.41:

            revno: 4367
            revision-id: knielsen@knielsen-hq.org-20141119125646-njq02h7irmu9s71b
            parent: sergii@pisem.net-20141118231952-fk3o69075dn3ravv
            committer: Kristian Nielsen <knielsen@knielsen-hq.org>
            branch nick: mariadb-5.5
            timestamp: Wed 2014-11-19 13:56:46 +0100
            message:
              MDEV-7026: Race in InnoDB/XtraDB mutex implementation can stall or hang the server.
             
              The bug was that full memory barrier was missing in the code that ensures that
              a waiter on an InnoDB mutex will not go to sleep unless it is guaranteed to be
              woken up again by another thread currently holding the mutex. This made
              possible a race where a thread could get stuck waiting for a mutex that is in
              fact no longer locked. If that thread was also holding other critical locks,
              this could stall the entire server. There is an error monitor thread that can
              break the stall, it runs about once per second. But if the error monitor
              thread itself got stuck or was not running, then the entire server could hang
              infinitely.
             
              This was introduced on i386/amd64 platforms in 5.5.40 and 10.0.13 by an
              incorrect patch that tried to fix the similar problem for PowerPC.
             
              This commit reverts the incorrect PowerPC patch, and instead implements a fix
              for PowerPC that does not change i386/amd64 behaviour, making PowerPC work
              similarly to i386/amd64.

            svoj Sergey Vojtovich made changes -
            Component/s Platform Power [ 10137 ]
            Component/s Storage Engine - InnoDB [ 10129 ]
            Component/s Storage Engine - XtraDB [ 10135 ]
            Fix Version/s 5.5.41 [ 17600 ]
            Fix Version/s 10.0.15 [ 17300 ]
            Fix Version/s 5.5 [ 15800 ]
            Fix Version/s 10.0 [ 16000 ]
            Resolution Fixed [ 1 ]
Status: Stalled [ 10000 ] → Closed [ 6 ]
            stewart-ibm Stewart Smith added a comment -

The use of __sync_synchronize() here is incorrect: it generates a sync instruction, a heavyweight barrier that is only needed if you're writing device drivers.

__sync_lock_release() is sufficient.

Realistically, __atomic_test_and_set() and __atomic_clear() should be used these days.


svoj Sergey Vojtovich added a comment -

stewart-ibm, right. A good mutex implementation should never need to issue a heavyweight sync.

In 5.5 and 10.0 the old mutex implementation was kept intact as much as possible, and for that implementation this sync is a must.

In 10.1 there was an aim (MDEV-6654) to reimplement InnoDB mutexes. This was suspended because InnoDB also reimplemented its mutexes in MySQL 5.7. We'll probably get back to this when we have InnoDB from 5.7.

            stewart-ibm Stewart Smith added a comment -

You should be able to remove the __sync_synchronize() and everything should still work. There's no reason for the sync instruction to be there.


knielsen Kristian Nielsen added a comment -

I do not plan to revisit this issue myself. But be sure to read the mailing list discussion on this in the archive; there was a very detailed discussion of what kinds of barriers are needed in the current implementation, and why.


svoj Sergey Vojtovich added a comment -

stewart-ibm, we added __sync_synchronize() to make sure "lock_word" is stored before "waiters" is loaded. That is to ensure StoreLoad order.

You say it's not needed. If so, how can we ensure StoreLoad order in this case?


            People

              svoj Sergey Vojtovich
              knielsen Kristian Nielsen
