Details

      Description

      Hi there,

      Every few days our cluster stalls. You always know doomsday has arrived, when this message appears in the log:

      May 17 06:46:25 uhu1 mysqld[27227]: WSREP: BF lock wait long
      

      This message then repeats a lot and other weird stuff finally appears (see below). The cluster never recovers. We always need to restart all nodes. I find this especially surprising, since there are various timeouts defined. If there is a deadlock, I wonder why it doesnt resolve automatically after some time by aborting with some error message. Googling for this message to find out whether it might be an application problem didnt yield any results either.

      May 17 06:46:26 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:27 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:27 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:28 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:29 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:30 uhu1 mysqld[27227]: message repeated 6 times: [ WSREP: BF lock wait long]
      May 17 06:46:31 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:32 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:33 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:34 uhu1 mysqld[27227]: message repeated 8 times: [ WSREP: BF lock wait long]
      May 17 06:46:35 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:36 uhu1 mysqld[27227]: message repeated 2 times: [ WSREP: BF lock wait long]
      May 17 06:46:37 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:38 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:39 uhu1 mysqld[27227]: message repeated 2 times: [ WSREP: BF lock wait long]
      May 17 06:46:40 uhu1 mysqld[27227]: WSREP: BF lock wait long
      May 17 06:46:40 uhu1 mysqld[27227]: =====================================
      May 17 06:46:40 uhu1 mysqld[27227]: 2017-05-17 06:46:40 0x7f52c3cdd700 INNODB MONITOR OUTPUT
      May 17 06:46:40 uhu1 mysqld[27227]: =====================================
      May 17 06:46:40 uhu1 mysqld[27227]: Per second averages calculated from the last 16 seconds
      May 17 06:46:40 uhu1 mysqld[27227]: -----------------
      May 17 06:46:40 uhu1 mysqld[27227]: BACKGROUND THREAD
      May 17 06:46:40 uhu1 mysqld[27227]: -----------------
      May 17 06:46:40 uhu1 mysqld[27227]: srv_master_thread loops: 570080 srv_active, 0 srv_shutdown, 120059 srv_idle
      May 17 06:46:40 uhu1 mysqld[27227]: srv_master_thread log flush and writes: 690137
      May 17 06:46:40 uhu1 mysqld[27227]: ----------
      May 17 06:46:40 uhu1 mysqld[27227]: SEMAPHORES
      May 17 06:46:40 uhu1 mysqld[27227]: ----------
      May 17 06:46:40 uhu1 mysqld[27227]: OS WAIT ARRAY INFO: reservation count 2567993
      May 17 06:46:40 uhu1 mysqld[27227]: OS WAIT ARRAY INFO: signal count 1741581
      May 17 06:46:40 uhu1 mysqld[27227]: RW-shared spins 0, rounds 1276685, OS waits 365942
      May 17 06:46:40 uhu1 mysqld[27227]: RW-excl spins 0, rounds 16000314, OS waits 198061
      May 17 06:46:40 uhu1 mysqld[27227]: RW-sx spins 210079, rounds 4201641, OS waits 50893
      May 17 06:46:40 uhu1 mysqld[27227]: Spin rounds per wait: 1276685.00 RW-shared, 16000314.00 RW-excl, 20.00 RW-sx
      May 17 06:46:40 uhu1 mysqld[27227]: ------------------------
      May 17 06:46:40 uhu1 mysqld[27227]: LATEST DETECTED DEADLOCK
      May 17 06:46:40 uhu1 mysqld[27227]: ------------------------
      May 17 06:46:40 uhu1 mysqld[27227]: 2017-05-17 03:24:35 0x7f52add94700
      May 17 06:46:40 uhu1 mysqld[27227]: *** (1) TRANSACTION:
      May 17 06:46:40 uhu1 mysqld[27227]: TRANSACTION 295857376, ACTIVE 0 sec inserting
      May 17 06:46:40 uhu1 mysqld[27227]: mysql tables in use 1, locked 1
      May 17 06:46:40 uhu1 mysqld[27227]: LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
      May 17 06:46:40 uhu1 mysqld[27227]: MySQL thread id 1704966, OS thread handle 0x7f52ae768700, query id 400303172 localhost 127.0.0.1 eposservicescom update
      May 17 06:46:40 uhu1 mysqld[27227]: INSERT INTO semaphore (name, value, expire) VALUES ('state:Drupal\\Core\\Cache\\CacheCollector', '1193996601591ba65393eb33.09632175', '1494984305.6058')
      May 17 06:46:40 uhu1 mysqld[27227]: *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
      May 17 06:46:40 uhu1 mysqld[27227]: RECORD LOCKS space id 4205 page no 3 n bits 72 index PRIMARY of table "eposservicescom"."semaphore" trx id 295857376 lock_mode X locks rec but not gap waiting
      May 17 06:46:40 uhu1 mysqld[27227]: Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
      May 17 06:46:40 uhu1 mysqld[27227]:  0: len 30; hex 73746174653a44727570616c5c436f72655c43616368655c436163686543; asc state:Drupal\Core\Cache\CacheC; (total 38 bytes);
      May 17 06:46:40 uhu1 mysqld[27227]:  1: len 6; hex 000011a26cb5; asc     l ;;
      May 17 06:46:40 uhu1 mysqld[27227]:  2: len 7; hex 40000006a82965; asc @    )e;;
      May 17 06:46:40 uhu1 mysqld[27227]:  3: len 30; hex 36353632373439313735393162613635316337623362302e313335333631; asc 656274917591ba651c7b3b0.135361; (total 32 bytes);
      May 17 06:46:40 uhu1 mysqld[27227]:  4: len 8; hex 4850f49be946d641; asc HP   F A;;
      May 17 06:46:40 uhu1 mysqld[27227]: *** (2) TRANSACTION:
      May 17 06:46:40 uhu1 mysqld[27227]: TRANSACTION 295857375, ACTIVE 0 sec inserting
      May 17 06:46:40 uhu1 mysqld[27227]: mysql tables in use 1, locked 1
      May 17 06:46:40 uhu1 mysqld[27227]: 3 lock struct(s), heap size 1136, 2 row lock(s)
      May 17 06:46:40 uhu1 mysqld[27227]: MySQL thread id 1704967, OS thread handle 0x7f52add94700, query id 400303173 localhost 127.0.0.1 eposservicescom update
      May 17 06:46:40 uhu1 mysqld[27227]: INSERT INTO semaphore (name, value, expire) VALUES ('state:Drupal\\Core\\Cache\\CacheCollector', '1474565599591ba65393ebe9.44223691', '1494984305.6058')
      May 17 06:46:40 uhu1 mysqld[27227]: *** (2) HOLDS THE LOCK(S):
      May 17 06:46:40 uhu1 mysqld[27227]: RECORD LOCKS space id 4205 page no 3 n bits 72 index PRIMARY of table "eposservicescom"."semaphore" trx id 295857375 lock mode S locks rec but not gap
      May 17 06:46:40 uhu1 mysqld[27227]: Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
      May 17 06:46:40 uhu1 mysqld[27227]:  0: len 30; hex 73746174653a44727570616c5c436f72655c43616368655c436163686543; asc state:Drupal\Core\Cache\CacheC; (total 38 bytes);
      May 17 06:46:40 uhu1 mysqld[27227]:  1: len 6; hex 000011a26cb5; asc     l ;;
      May 17 06:46:40 uhu1 mysqld[27227]:  2: len 7; hex 40000006a82965; asc @    )e;;
      May 17 06:46:40 uhu1 mysqld[27227]:  3: len 30; hex 36353632373439313735393162613635316337623362302e313335333631; asc 656274917591ba651c7b3b0.135361; (total 32 bytes);
      May 17 06:46:40 uhu1 mysqld[27227]:  4: len 8; hex 4850f49be946d641; asc HP   F A;;
      May 17 06:46:40 uhu1 mysqld[27227]: *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
      May 17 06:46:40 uhu1 mysqld[27227]: RECORD LOCKS space id 4205 page no 3 n bits 72 index PRIMARY of table "eposservicescom"."semaphore" trx id 295857375 lock_mode X locks rec but not gap waiting
      May 17 06:46:40 uhu1 mysqld[27227]: Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 32
      May 17 06:46:40 uhu1 mysqld[27227]:  0: len 30; hex 73746174653a44727570616c5c436f72655c43616368655c436163686543; asc state:Drupal\Core\Cache\CacheC; (total 38 bytes);
      May 17 06:46:40 uhu1 mysqld[27227]:  1: len 6; hex 000011a26cb5; asc     l ;;
      May 17 06:46:40 uhu1 mysqld[27227]:  2: len 7; hex 40000006a82965; asc @    )e;;
      May 17 06:46:40 uhu1 mysqld[27227]:  3: len 30; hex 36353632373439313735393162613635316337623362302e313335333631; asc 656274917591ba651c7b3b0.135361; (total 32 bytes);
      May 17 06:46:40 uhu1 mysqld[27227]:  4: len 8; hex 4850f49be946d641; asc HP   F A;;
      May 17 06:46:40 uhu1 mysqld[27227]: *** WE ROLL BACK TRANSACTION (2)
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                jplindst Jan Lindström
                Reporter:
                rgpublic Ranjan Ghosh
              • Votes:
                19 Vote for this issue
                Watchers:
                38 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: