  1. MariaDB Server
  2. MDEV-6489

rpl.rpl_insert, rpl.rpl_insert_delayed and main.mysqlslap fail on PPC64

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.0.12
    • Fix Version/s: 10.0.13
    • Component/s: None
    • Labels: None

    Description

      All of these tests execute mysqlslap, which deadlocks. Below is simplified code from mysqlslap which also deadlocks on PPC64:

      #include <pthread.h>
       
      pthread_mutex_t mutex;
      pthread_cond_t cond;
      int master_wakeup;
       
      static void *thread_start(void *arg)
      {
        pthread_mutex_lock(&mutex);
        while (master_wakeup)
          pthread_cond_wait(&cond, &mutex);
        pthread_mutex_unlock(&mutex);
       
        return 0;
      }
       
      int main(void)
      {
        int i, t;
        pthread_t thread_id[5];
       
        pthread_mutex_init(&mutex, 0);
        pthread_cond_init(&cond, 0);
       
        for (i= 0; i < 1000; i++)
        {
          master_wakeup= 1;
       
          for (t= 0; t < 5; t++)
            if (pthread_create(&thread_id[t], 0, thread_start, 0))
              return 1;
       
          pthread_mutex_lock(&mutex);
          master_wakeup= 0;
          pthread_mutex_unlock(&mutex);
          pthread_cond_broadcast(&cond);
       
          for (t= 0; t < 5; t++)
            pthread_join(thread_id[t], 0);
        }
       
        pthread_mutex_destroy(&mutex);
        pthread_cond_destroy(&cond);
       
        return 0;
      }

      If we move the broadcast call up one line, so that it is issued while the mutex is still held, the program no longer deadlocks. I believe it should make no difference whether broadcast is called inside or outside the mutex, because the manual says:

        These functions atomically release mutex and cause the calling thread to block
        on the condition variable cond; atomically here means "atomically with respect
        to access by another thread to the mutex and then the condition variable".
        That is, if another thread is able to acquire the mutex after the
        about-to-block thread has released it, then a subsequent call to
        pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave
        as if it were issued after the about-to-block thread has blocked.
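
      The variant described above, with the broadcast issued while the mutex is still held, can be sketched as follows. This is a minimal sketch based on the test case, not the actual mysqlslap patch: run_rounds() and MAX_THREADS are hypothetical names introduced so that the loop can be called with different round and thread counts.

```c
#include <pthread.h>

#define MAX_THREADS 16   /* hypothetical upper bound, not from mysqlslap */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int master_wakeup;

/* Worker: block until the master clears master_wakeup. */
static void *thread_start(void *arg)
{
  (void) arg;
  pthread_mutex_lock(&mutex);
  while (master_wakeup)
    pthread_cond_wait(&cond, &mutex);
  pthread_mutex_unlock(&mutex);
  return 0;
}

/*
  Hypothetical wrapper around the loop from the test case.
  Returns 0 on success, 1 if nthreads is out of range or a
  thread could not be created.
*/
int run_rounds(int rounds, int nthreads)
{
  pthread_t thread_id[MAX_THREADS];
  int i, t;

  if (nthreads < 1 || nthreads > MAX_THREADS)
    return 1;

  for (i= 0; i < rounds; i++)
  {
    int created= 0, failed= 0;

    /* No workers exist at this point, so no lock is needed here. */
    master_wakeup= 1;

    for (t= 0; t < nthreads; t++)
    {
      if (pthread_create(&thread_id[t], 0, thread_start, 0))
      {
        failed= 1;
        break;
      }
      created++;
    }

    pthread_mutex_lock(&mutex);
    master_wakeup= 0;
    /* Broadcast moved up one line: issued while the mutex is held. */
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);

    /* Join whatever was started, even if creation failed part-way. */
    for (t= 0; t < created; t++)
      pthread_join(thread_id[t], 0);

    if (failed)
      return 1;
  }
  return 0;
}
```

      Calling run_rounds(1000, 5) from main() mirrors the loop in the test case. The only behavioural change is the order of pthread_cond_broadcast() and pthread_mutex_unlock().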

          Activity

            Sergei, please review fix for this bug.

            A patch has been pushed to 10.0.13:

            revno: 4306
            revision-id: svoj@mariadb.org-20140725130247-cl64fv8g6g2ydbq7
            parent: jplindst@mariadb.org-20140725073016-8y0e2u8zxd0x4z7t
            committer: Sergey Vojtovich <svoj@mariadb.org>
            branch nick: 10.0
            timestamp: Fri 2014-07-25 17:02:47 +0400
            message:
              MDEV-6489 - rpl.rpl_insert, rpl.rpl_insert_delayed and
                          main.mysqlslap fail on PPC64
              
              There seem to be a bug on Power8 which doesn't guarantee
              a signal to be delivered to waiting thread if broadcast
              is called outside of mutex.
              
              For now workaround it by calling broadcast while mutex is
              still held.

            (comment added by svoj, Sergey Vojtovich)

            Also, the manpage for pthread_cond_broadcast() is very explicit about it:

            The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal().

            (comment added by serg, Sergei Golubchik)
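
            When chasing a hang of this kind, it can help to make a lost wakeup show up as a timeout instead of a process that blocks forever. The sketch below uses pthread_cond_timedwait(), which the manpage text above mentions, for that purpose. It is a diagnostic aid only, not part of the pushed fix, and wait_until_clear() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/*
  Hypothetical diagnostic helper: wait until *flag becomes 0, but give
  up after timeout_sec seconds. Returns 0 if the flag was cleared,
  ETIMEDOUT if the deadline passed with the flag still set, or another
  error code from pthread_cond_timedwait().
  Assumes the condition variable uses the default clock (CLOCK_REALTIME).
*/
int wait_until_clear(pthread_mutex_t *m, pthread_cond_t *c,
                     const int *flag, int timeout_sec)
{
  struct timespec deadline;
  int rc= 0;

  /* pthread_cond_timedwait() takes an absolute deadline. */
  clock_gettime(CLOCK_REALTIME, &deadline);
  deadline.tv_sec+= timeout_sec;

  pthread_mutex_lock(m);
  while (*flag && rc == 0)
    rc= pthread_cond_timedwait(c, m, &deadline);
  if (!*flag)
    rc= 0;   /* the flag was cleared, regardless of how the wait ended */
  pthread_mutex_unlock(m);

  return rc;
}
```

            A worker that calls wait_until_clear(&mutex, &cond, &master_wakeup, 10) in place of the plain wait loop would return ETIMEDOUT when the wakeup never arrives, which a test can report as a failure.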

            ok to push

            (comment added by serg, Sergei Golubchik)

            People

              Assignee: svoj Sergey Vojtovich
              Reporter: svoj Sergey Vojtovich
              Votes: 0
              Watchers: 2
