[MDEV-6489] rpl.rpl_insert, rpl.rpl_insert_delayed and main.mysqlslap fail on PPC64
Created: 2014-07-25  Updated: 2014-07-31  Resolved: 2014-07-30

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.12
Fix Version/s: 10.0.13

Type: Bug
Priority: Critical
Reporter: Sergey Vojtovich
Assignee: Sergey Vojtovich
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
PartOf
is part of MDEV-6478 MariaDB on Power8 Closed

 Description   

All of these tests execute mysqlslap, which deadlocks. Below is simplified code from mysqlslap which also deadlocks on PPC64:

#include <pthread.h>
 
pthread_mutex_t mutex;
pthread_cond_t cond;
int master_wakeup;
 
static void *thread_start(void *arg)
{
  pthread_mutex_lock(&mutex);
  while (master_wakeup)
    pthread_cond_wait(&cond, &mutex);
  pthread_mutex_unlock(&mutex);
 
  return 0;
}
 
int main(void)
{
  int i, t;
  pthread_t thread_id[5];
 
  pthread_mutex_init(&mutex, 0);
  pthread_cond_init(&cond, 0);
 
  for (i= 0; i < 1000; i++)
  {
    master_wakeup= 1;
 
    for (t= 0; t < 5; t++)
      if (pthread_create(&thread_id[t], 0, thread_start, 0))
        return 1;
 
    pthread_mutex_lock(&mutex);
    master_wakeup= 0;
    pthread_mutex_unlock(&mutex);
    pthread_cond_broadcast(&cond);
 
    for (t= 0; t < 5; t++)
      pthread_join(thread_id[t], 0);
  }
 
  pthread_mutex_destroy(&mutex);
  pthread_cond_destroy(&cond);
 
  return 0;
}

If we move the broadcast call up one line so that it is protected by the mutex, the program doesn't deadlock. I believe it should make no difference whether broadcast is called with or without the mutex held, because the manual for pthread_cond_wait() says:

  These functions atomically release mutex and cause the calling thread to block
  on the condition variable cond; atomically here means "atomically with respect
  to access by another thread to the mutex and then the condition variable".
  That is, if another thread is able to acquire the mutex after the
  about-to-block thread has released it, then a subsequent call to
  pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave
  as if it were issued after the about-to-block thread has blocked.



 Comments   
Comment by Sergey Vojtovich [ 2014-07-25 ]

Sergei, please review the fix for this bug.

A patch has been pushed to 10.0.13:

revno: 4306
revision-id: svoj@mariadb.org-20140725130247-cl64fv8g6g2ydbq7
parent: jplindst@mariadb.org-20140725073016-8y0e2u8zxd0x4z7t
committer: Sergey Vojtovich <svoj@mariadb.org>
branch nick: 10.0
timestamp: Fri 2014-07-25 17:02:47 +0400
message:
  MDEV-6489 - rpl.rpl_insert, rpl.rpl_insert_delayed and
              main.mysqlslap fail on PPC64
  
  There seems to be a bug on Power8 where a signal is not
  guaranteed to be delivered to a waiting thread if broadcast
  is called outside of the mutex.
  
  For now, work around it by calling broadcast while the mutex
  is still held.
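
For illustration, the workaround described in the commit message can be sketched against the simplified test program from the description. This is not the actual mysqlslap patch, only a minimal sketch: `run_rounds` and `NUM_THREADS` are hypothetical names introduced here, and the static initializers replace the explicit init/destroy calls for brevity. The only behavioral change from the original program is that pthread_cond_broadcast() is called before the mutex is released.

```c
#include <pthread.h>

#define NUM_THREADS 5  /* hypothetical name; the original hard-codes 5 */

static pthread_mutex_t mutex= PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond= PTHREAD_COND_INITIALIZER;
static int master_wakeup;  /* predicate, read and written under mutex */

static void *thread_start(void *arg)
{
  (void) arg;
  pthread_mutex_lock(&mutex);
  /* Loop guards against spurious wakeups. */
  while (master_wakeup)
    pthread_cond_wait(&cond, &mutex);
  pthread_mutex_unlock(&mutex);
  return 0;
}

/* Hypothetical helper: runs `rounds` start/release cycles.
   Returns 0 on success, 1 if a thread could not be created or joined. */
static int run_rounds(int rounds)
{
  pthread_t thread_id[NUM_THREADS];
  int i, t;

  for (i= 0; i < rounds; i++)
  {
    int created= 0, failed= 0;

    pthread_mutex_lock(&mutex);
    master_wakeup= 1;
    pthread_mutex_unlock(&mutex);

    for (t= 0; t < NUM_THREADS; t++)
    {
      if (pthread_create(&thread_id[t], 0, thread_start, 0))
      {
        failed= 1;
        break;
      }
      created++;
    }

    /* Workaround: broadcast while the mutex is still held. */
    pthread_mutex_lock(&mutex);
    master_wakeup= 0;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mutex);

    /* Join only the threads that were actually started. */
    for (t= 0; t < created; t++)
      if (pthread_join(thread_id[t], 0))
        failed= 1;

    if (failed)
      return 1;
  }
  return 0;
}
```

Calling `run_rounds(1000)` from `main()` mirrors the 1000 iterations of the original reproduction; link with the platform's pthread option (for example `-pthread` with GCC).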

Comment by Sergei Golubchik [ 2014-07-30 ]

Also, the manpage for pthread_cond_broadcast() is very explicit about it:

The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal().

Comment by Sergei Golubchik [ 2014-07-30 ]

ok to push
