Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.13
    • Fix Version/s: 5.5.40, 10.0.14
    • Component/s: None
    • Labels: None
    • Environment: power8, RH6.5

    Description

      NB: A fix for this bug is also present in Stewart Smith's patchset: memory_barrier-experimental_5.6.4.diff.

      From the error log:

      2014-07-31 21:02:00 ff6fb757190  InnoDB: Assertion failure in thread 17553455149456 in file sync0rw.cc line 690
      InnoDB: Failing assertion: !lock->recursive
      InnoDB: We intentionally generate a memory trap.
      ...
      stack_bottom = 0xff6fb756610 thread_stack 0x48000
      :0(000000ca.plt_call.MD5_Init)[0x109b476c]
      :0(000000ca.plt_call.MD5_Init)[0x103d7180]
      linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0xfff8fa30448]
      /opt/at7.0/lib64/power7/libc.so.6(gsignal-0x16f708)[0xfff8f1cf8f0]
      /opt/at7.0/lib64/power7/libc.so.6(abort-0x16dab4)[0xfff8f1d19c4]
      sync/sync0rw.cc:690(000000ca.plt_call.MD5_Init)[0x10124318]
      sync/sync0rw.cc:834(000000ca.plt_call.MD5_Init)[0x107d2b08]
      include/sync0rw.ic:917(pfs_rw_lock_x_lock_func)[0x108329b4]
      include/btr0sea.ic:81(000000ca.plt_call.MD5_Init)[0x1081c85c]
      include/btr0pcur.ic:485(btr_pcur_open_with_no_init_func)[0x107b3c74]
      handler/ha_innodb.cc:8374(000000ca.plt_call.MD5_Init)[0x106dfc4c]
      sql/handler.h:2888(000000ca.plt_call.MD5_Init)[0x103e8f64]
      sql/handler.cc:5520(000000ca.plt_call.MD5_Init)[0x103d7e6c]
      sql/handler.cc:2609(000000ca.plt_call.MD5_Init)[0x103de780]
      sql/sql_select.cc:18167(000000ca.plt_call.MD5_Init)[0x1023b53c]
      sql/table.h:1366(disable_keyread)[0x10115844]
      sql/sql_select.cc:3785(000000ca.plt_call.MD5_Init)[0x1011980c]
      sql/sql_select.cc:1338(optimize_inner)[0x1026b5fc]
      sql/sql_select.cc:3289(mysql_select)[0x10270180]
      ...
      Query (0xff68001a850): SELECT c FROM sbtest18 WHERE id=4968
      Connection ID (thread ID): 1287

      This is MariaDB 10.0, bzr revision 4308, compiled with the Advance Toolchain (ATC) 7.0. Unlike previous (working) binaries, this one uses libaio.

          Activity

            Below are my comments on the InnoDB memory barriers framework. I will post an additional comment on the correctness of the barriers when I complete the review.

            • No action: non-atomic loads/stores of shared variables are evil, but nobody seems to care since all loads are 32-bit, which are known to be atomic.
            • No action: my_atomic.h doesn't need cmake probes; all checks are done via ifdefs, which to my taste is more compact. But since InnoDB accepted the memory barriers patch with cmake probes, we probably shouldn't bother about it either.
            • No action: we wondered about the reasons for reducing the number of spins. There is a comment added along with rev.6004 to MySQL-5.6.20: the internal counter for innodb_sync_spin_loops is adjusted because a memory barrier is more expensive than an empty loop.
            • Action: we are missing the definition of HAVE_WINDOWS_MM_FENCE in CMakeLists.txt. See how it is handled in rev.6004 of MySQL-5.6.20.
            • # define os_rmb do { } while(0) and # define os_wmb do { } while(0)
              do {} while(0) is excessive; just #define os_rmb should be fine.
            svoj Sergey Vojtovich added a comment
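
The last point of the comment above can be illustrated with a small C sketch. The macro and function names here (SKETCH_RMB_EMPTY, SKETCH_RMB_DO_WHILE, sketch_read_empty, sketch_read_do_while) are hypothetical stand-ins for a no-op os_rmb on a platform that needs no barrier; they are not the InnoDB definitions. Both forms yield a valid statement when the macro is written as `NAME;`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a no-op os_rmb; NOT the InnoDB definitions. */
#define SKETCH_RMB_EMPTY                      /* expands to nothing */
#define SKETCH_RMB_DO_WHILE do { } while (0)  /* expands to an empty loop */

/* Reads *p after the empty-macro "barrier". */
static int sketch_read_empty(const int *p)
{
    assert(p != NULL);
    SKETCH_RMB_EMPTY;     /* becomes a lone ';' (a null statement) */
    return *p;
}

/* Reads *p after the do/while(0) "barrier". */
static int sketch_read_do_while(const int *p)
{
    assert(p != NULL);
    SKETCH_RMB_DO_WHILE;  /* becomes 'do { } while (0);' */
    return *p;
}
```

Both functions behave identically. One practical difference is that some compilers may warn about an empty body when the empty form is used as the sole statement of an `if`, which is the usual argument for keeping do { } while(0).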

            On memory barriers in mutexes:

            - mutex_get_waiters() is missing an acquire memory barrier. This may
              cause mutex_exit_func() to read a stale 'waiters' value and may be
              the reason for the deadlock.

              There seems to be a workaround for that: srv_error_monitor_thread()
              is supposed to wake these stale threads every second. But if that's
              the case, we don't really need the release memory barrier in
              mutex_set_waiters().

            - ib_mutex_test_and_set(): the release memory barrier should not be
              needed; we hold the mutex anyway and don't care at which point
              lock_word becomes visible to other threads.

            - mutex_get_lock_word(): the acquire memory barrier should not be needed.

            svoj Sergey Vojtovich added a comment
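
The 'waiters' handshake discussed in this comment can be sketched with C11 atomics. This is a minimal illustration, not the InnoDB code: the type sketch_mutex_t and all sketch_* functions are hypothetical, and the full fence in sketch_mutex_exit() is an assumption of this sketch, added because release/acquire ordering by itself does not order a store before a later load of a different variable.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical miniature of a mutex with a 'waiters' flag; NOT InnoDB code. */
typedef struct {
    atomic_uint lock_word;  /* 0 = free, 1 = held */
    atomic_uint waiters;    /* 0 = nobody waiting, 1 = someone waiting */
} sketch_mutex_t;

static void sketch_mutex_init(sketch_mutex_t *m)
{
    atomic_init(&m->lock_word, 0);
    atomic_init(&m->waiters, 0);
}

/* Waiter side: publish the flag with release semantics. */
static void sketch_set_waiters(sketch_mutex_t *m, unsigned n)
{
    atomic_store_explicit(&m->waiters, n, memory_order_release);
}

/* Owner side: read the flag with acquire semantics, pairing with the
 * release store above. */
static unsigned sketch_get_waiters(sketch_mutex_t *m)
{
    return atomic_load_explicit(&m->waiters, memory_order_acquire);
}

/* Releases the lock; returns 1 if the caller should wake up waiters. */
static int sketch_mutex_exit(sketch_mutex_t *m)
{
    atomic_store_explicit(&m->lock_word, 0, memory_order_release);
    /* Assumption of this sketch: a full fence keeps the lock_word store
     * ahead of the waiters load on weakly ordered CPUs. */
    atomic_thread_fence(memory_order_seq_cst);
    return sketch_get_waiters(m) != 0;
}
```

If the load in sketch_mutex_exit() could observe a stale 0 while a waiter has already set the flag and gone to sleep, nobody would wake that waiter, which is the deadlock scenario the comment describes.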

            Neither of the acquire memory barriers in sync_arr_cell_can_wake_up() should be needed.

            svoj Sergey Vojtovich added a comment

            revno: 3413.65.7
            revision-id: monty@mariadb.org-20140819162835-sorv0ogd39f7mui8
            parent: knielsen@knielsen-hq.org-20140813134639-wk760plnzg5wu4x8
            committer: Michael Widenius <monty@mariadb.org>
            branch nick: maria-5.5
            timestamp: Tue 2014-08-19 19:28:35 +0300
            message:
            MDEV-6450 - MariaDB crash on Power8 when built with advance tool chain
             
            Part of this work is based on Stewart Smith's memory barrier and lower priori
            patches for power8.
             
            - Added memory synchronization for innodb & xtradb for power8.
            - Added HAVE_WINDOWS_MM_FENCE to CMakeLists.txt
            - Added os_isync to fix a synchronization problem on power
            - Added log_get_lsn_nowait which is now used srv_error_monitor_thread to ensur
              if log mutex is locked.
             
            All changes done both for InnoDB and Xtradb

            svoj Sergey Vojtovich added a comment
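
The commit log above mentions log_get_lsn_nowait but the line is truncated, so its exact contract is not stated here. The sketch below shows one way a non-blocking accessor of that kind could look, assuming it tries the log mutex once and reports failure instead of waiting. The type sketch_log_t, the sketch_* functions and the "return 0 when busy" convention are all hypothetical and are not taken from the MariaDB source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the log system; NOT the MariaDB structure. */
typedef struct {
    atomic_flag mutex;  /* set = locked, clear = free */
    uint64_t    lsn;    /* protected by mutex */
} sketch_log_t;

static void sketch_log_init(sketch_log_t *log, uint64_t lsn)
{
    atomic_flag_clear(&log->mutex);
    log->lsn = lsn;
}

/* Returns the current LSN, or 0 if the mutex is busy.
 * The 0-on-busy convention is an assumption of this sketch. */
static uint64_t sketch_log_get_lsn_nowait(sketch_log_t *log)
{
    uint64_t lsn;

    /* Try the lock exactly once; never spin or sleep. */
    if (atomic_flag_test_and_set_explicit(&log->mutex,
                                          memory_order_acquire)) {
        return 0;
    }
    lsn = log->lsn;
    atomic_flag_clear_explicit(&log->mutex, memory_order_release);
    return lsn;
}
```

A monitoring thread that calls such a function cannot itself get stuck behind the mutex it is supposed to watch; a caller would treat the 0 result as "try again on the next pass".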

            People

              monty Michael Widenius
              axel Axel Schwenke