Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19803

Long semaphore wait error on galera.MW-388

Details

    Description

      galera.MW-388 'innodb'                   w1 [ fail ]  Found warnings/errors in server log file!
              Test ended at 2019-06-18 03:46:15
      line
      2019-06-18  3:45:15 140319937001216 [Warning] InnoDB: A long semaphore wait:
      2019-06-18  3:45:46 140319937001216 [Warning] InnoDB: A long semaphore wait:
      ^ Found warnings in /dev/shm/var/1/log/mysqld.1.err
      ok
       
       - saving '/dev/shm/var/1/log/galera.MW-388-innodb/' to '/dev/shm/var/log/galera.MW-388-innodb/'
       
      Retrying test galera.MW-388, attempt(2/3)...
       
      worker[1] > Restart  - not started
      worker[1] > Restart  - not started
      galera.galera_as_master 'innodb'         w2 [ pass ]   2871
      worker[2] > Restart [mysqld.1 - pid: 9007, winpid: 9007] - using different config file
      worker[2] > Restart [mysqld.2 - pid: 9009, winpid: 9009] - using different config file
      worker[2] > Restart [mysqld.3 - pid: 9011, winpid: 9011] - using different config file
      galera.MW-388 'innodb'                   w1 [ retry-pass ]   2063
      

      Attachments

        Issue Links

          Activity

            jplindst Jan Lindström (Inactive) added a comment - - edited http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/2094/steps/mtr-galera/logs/stdio http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/2094/steps/mtr-galera/logs/mysqld.1.err.1

            CURRENT_TEST: galera.MW-388
            2019-06-18  3:45:15 140319937001216 [Warning] InnoDB: A long semaphore wait:
            --Thread 140320104756992 has waited at buf0flu.cc line 1229 for 241.00 seconds the semaphore:
            SX-lock on RW-latch at 0x7f9ee1bf72f0 created in file buf0buf.cc line 1496
            a writer (thread id 140320477054720) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file row0ins.cc line 2599
            Last time write locked in file row0ins.cc line 2599
            2019-06-18  3:45:15 140319937001216 [Note] InnoDB: A semaphore wait:
            --Thread 140320104756992 has waited at buf0flu.cc line 1229 for 241.00 seconds the semaphore:
            SX-lock on RW-latch at 0x7f9ee1bf72f0 created in file buf0buf.cc line 1496
            a writer (thread id 140320477054720) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file row0ins.cc line 2599
            Last time write locked in file row0ins.cc line 2599
            InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
            InnoDB: Pending reads 0, writes 0
             
            =====================================
            2019-06-18 03:45:23 0x7f9ed17fa700 INNODB MONITOR OUTPUT
            =====================================
            Per second averages calculated from the last 60 seconds
            -----------------
            BACKGROUND THREAD
            -----------------
            srv_master_thread loops: 18 srv_active, 0 srv_shutdown, 301 srv_idle
            srv_master_thread log flush and writes: 319
            ----------
            SEMAPHORES
            ----------
            -------------
            RW-LATCH INFO
            -------------
            RW-LOCK: 0x7f9ee1bf7418 
            Locked: thread 140320477054720 file row0ins.cc line 2599  S-LOCK
            RW-LOCK: 0x7f9ee1bf72f0  Waiters for the lock exist
            Locked: thread 140320477054720 file row0ins.cc line 2599  X-LOCK
            Total number of rw-locks 1075
            OS WAIT ARRAY INFO: reservation count 285
            --Thread 140320104756992 has waited at buf0flu.cc line 1229 for 249.00 seconds the semaphore:
            SX-lock on RW-latch at 0x7f9ee1bf72f0 created in file buf0buf.cc line 1496
            a writer (thread id 140320477054720) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time read locked in file row0ins.cc line 2599
            Last time write locked in file row0ins.cc line 2599
            OS WAIT ARRAY INFO: signal count 201
            RW-shared spins 0, rounds 229, OS waits 97
            RW-excl spins 0, rounds 288, OS waits 9
            RW-sx spins 0, rounds 0, OS waits 0
            Spin rounds per wait: 229.00 RW-shared, 288.00 RW-excl, 0.00 RW-sx
            FAIL TO OBTAIN LOCK MUTEX, SKIP LOCK INFO PRINTING
            

            jplindst Jan Lindström (Inactive) added a comment - CURRENT_TEST: galera.MW-388 2019-06-18 3:45:15 140319937001216 [Warning] InnoDB: A long semaphore wait: --Thread 140320104756992 has waited at buf0flu.cc line 1229 for 241.00 seconds the semaphore: SX-lock on RW-latch at 0x7f9ee1bf72f0 created in file buf0buf.cc line 1496 a writer (thread id 140320477054720) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time read locked in file row0ins.cc line 2599 Last time write locked in file row0ins.cc line 2599 2019-06-18 3:45:15 140319937001216 [Note] InnoDB: A semaphore wait: --Thread 140320104756992 has waited at buf0flu.cc line 1229 for 241.00 seconds the semaphore: SX-lock on RW-latch at 0x7f9ee1bf72f0 created in file buf0buf.cc line 1496 a writer (thread id 140320477054720) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time read locked in file row0ins.cc line 2599 Last time write locked in file row0ins.cc line 2599 InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info: InnoDB: Pending reads 0, writes 0   ===================================== 2019-06-18 03:45:23 0x7f9ed17fa700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 60 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops: 18 srv_active, 0 srv_shutdown, 301 srv_idle srv_master_thread log flush and writes: 319 ---------- SEMAPHORES ---------- ------------- RW-LATCH INFO ------------- RW-LOCK: 0x7f9ee1bf7418 Locked: thread 140320477054720 file row0ins.cc line 2599 S-LOCK RW-LOCK: 0x7f9ee1bf72f0 Waiters for the lock exist Locked: thread 140320477054720 file row0ins.cc line 2599 X-LOCK Total number of rw-locks 1075 OS WAIT ARRAY INFO: reservation count 285 --Thread 140320104756992 has waited at buf0flu.cc line 1229 for 249.00 seconds the semaphore: SX-lock on RW-latch at 0x7f9ee1bf72f0 created in file buf0buf.cc line 1496 a writer (thread id 140320477054720) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time read locked in file row0ins.cc line 2599 Last time write locked in file row0ins.cc line 2599 OS WAIT ARRAY INFO: signal count 201 RW-shared spins 0, rounds 229, OS waits 97 RW-excl spins 0, rounds 288, OS waits 9 RW-sx spins 0, rounds 0, OS waits 0 Spin rounds per wait: 229.00 RW-shared, 288.00 RW-excl, 0.00 RW-sx FAIL TO OBTAIN LOCK MUTEX, SKIP LOCK INFO PRINTING

            marko I do not immediately see how this could be related only to Galera.

            jplindst Jan Lindström (Inactive) added a comment - marko I do not immediately see how this could be related only to Galera.

            jplindst, please post the stack traces of all threads that participate in the hang, or show a non-Galera test where InnoDB hangs in this way.

            Which exact versions are affected? Did you really see this in all of 10.2, 10.3, 10.4? Please always identify the exact source code revisions.

            I believe that such hangs in Galera can occur due to MDEV-18464 not having been fixed.

            marko Marko Mäkelä added a comment - jplindst , please post the stack traces of all threads that participate in the hang, or show a non-Galera test where InnoDB hangs in this way. Which exact versions are affected? Did you really see this in all of 10.2, 10.3, 10.4? Please always identify the exact source code revisions. I believe that such hangs in Galera can occur due to MDEV-18464 not having been fixed.

            Unfortunately, I do not have full stack trace. I can't repeat this locally.

            Repeats at:

            Test does not use KILL command and semaphores on report are not lock_sys or trx_mutex so MDEV-18464 is not likely the cause.

            jplindst Jan Lindström (Inactive) added a comment - Unfortunately, I do not have full stack trace. I can't repeat this locally. Repeats at: 10.3 192aa295b4bc877d0fed24154c57304c3f789179 http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/2107 10.4 efbfcc8b73d9fd8bafd86ab387e0c1edd39e208b http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/2106 10.2 71eea0c3fb87f37f6377c9bdad9339b664b11d0f http://buildbot.askmonty.org/buildbot/builders/kvm-rpm-centos74-amd64-debug/builds/2094 Test does not use KILL command and semaphores on report are not lock_sys or trx_mutex so MDEV-18464 is not likely the cause.

            Hello,

            I've had something similar happen after an upgrade from 10.3.16 -> 10.3.18. The first galera node was brought up without problems, but the remaining two failed to receive IST while the logs were inundated with these "semaphore" warnings. In the end, I had to kill the server and start it again several times for the IST to complete and re-join the cluster.

            I'm posting excerpts from the log here in case it is of any value. Do you think it is the same issue? Or should open a new one? Thanks!

            2019-09-24  1:11:06 2 [Note] WSREP: Receiving IST: 9009 writesets, seqnos 4727456747-4727465756
            2019-09-24  1:11:06 0 [Note] WSREP: Receiving IST...  0.0% (   0/9009 events) complete.
            2019-09-24  1:11:06 2 [Warning] InnoDB: Cannot add field `as_contact` in table `kamailio`.`topos_t` because after adding it, the row size is 8193 which is greater than maximum allowed size (8126) for a record on index leaf page.
            2019-09-24  1:11:06 2 [Warning] InnoDB: Cannot add field `s_rr` in table `kamailio`.`topos_d` because after adding it, the row size is 8166 which is greater than maximum allowed size (8126) for a record on index leaf page.
            2019-09-24  1:11:07 0 [Note] WSREP: (086791d9, 'tcp://172.30.20.129:4567') turning message relay requesting off
             
            =====================================
            2019-09-24 01:11:26 0x7fa92b456700 INNODB MONITOR OUTPUT
            =====================================
            Per second averages calculated from the last 20 seconds
            -----------------
            BACKGROUND THREAD
            -----------------
            srv_master_thread loops: 1 srv_active, 0 srv_shutdown, 0 srv_idle
            srv_master_thread log flush and writes: 0
            ----------
            SEMAPHORES
            ----------
            OS WAIT ARRAY INFO: reservation count 210
            --Thread 140363891283712 has waited at ibuf0ibuf.cc line 4465 for 19.00 seconds the semaphore:
            S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            --Thread 140363899676416 has waited at ibuf0ibuf.cc line 4465 for 19.00 seconds the semaphore:
            S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            

            Similar logs follow after every INNODB MONITOR OUTPUT. After a while, I start getting these in between InnoDB Monitor messages:

            2019-09-24  1:15:07 0 [Warning] InnoDB: A long semaphore wait:
            --Thread 140392823322368 has waited at lock0lock.cc line 6261 for 241.00 seconds the semaphore:
            Mutex at 0x56329c07c500, Mutex LOCK_SYS created lock0lock.cc:460, lock var 2
             
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140363891283712 has waited at ibuf0ibuf.cc line 4465 for 240.00 seconds the semaphore:
            S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140363899676416 has waited at ibuf0ibuf.cc line 4465 for 240.00 seconds the semaphore:
            S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140363950032640 has waited at ibuf0ibuf.cc line 4465 for 240.00 seconds the semaphore:
            S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140392823322368 has waited at lock0lock.cc line 6261 for 241.00 seconds the semaphore:
            Mutex at 0x56329c07c500, Mutex LOCK_SYS created lock0lock.cc:460, lock var 2
             
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140363924854528 has waited at lock0lock.cc line 3389 for 240.00 seconds the semaphore:
            Mutex at 0x56329c07c500, Mutex LOCK_SYS created lock0lock.cc:460, lock var 2
             
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140363941639936 has waited at ibuf0ibuf.cc line 4465 for 240.00 seconds the semaphore:
            S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140364535383808 has waited at ibuf0ibuf.cc line 2456 for 240.00 seconds the semaphore:
            S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140363815749376 has waited at buf0flu.cc line 1231 for 240.00 seconds the semaphore:
            SX-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            2019-09-24  1:15:07 0 [Note] InnoDB: A semaphore wait:
            --Thread 140364409558784 has waited at ibuf0ibuf.cc line 4465 for 238.00 seconds the semaphore:
            S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609
            a writer (thread id 140363924854528) has reserved it in mode  exclusive
            number of readers 0, waiters flag 1, lock_word: 0
            Last time write locked in file ibuf0ibuf.cc line 4465
            InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
            InnoDB: Pending reads 0, writes 0
            

            gedia George Diamantopoulos added a comment - Hello, I've had something similar happen after an upgrade from 10.3.16 -> 10.3.18. The first galera node was brought up without problems, but the remaining two failed to receive IST while the logs were inundated with these "semaphore" warnings. In the end, I had to kill the server and start it again several times for the IST to complete and re-join the cluster. I'm posting excerpts from the log here in case it is of any value. Do you think it is the same issue? Or should open a new one? Thanks! 2019-09-24 1:11:06 2 [Note] WSREP: Receiving IST: 9009 writesets, seqnos 4727456747-4727465756 2019-09-24 1:11:06 0 [Note] WSREP: Receiving IST... 0.0% ( 0/9009 events) complete. 2019-09-24 1:11:06 2 [Warning] InnoDB: Cannot add field `as_contact` in table `kamailio`.`topos_t` because after adding it, the row size is 8193 which is greater than maximum allowed size (8126) for a record on index leaf page. 2019-09-24 1:11:06 2 [Warning] InnoDB: Cannot add field `s_rr` in table `kamailio`.`topos_d` because after adding it, the row size is 8166 which is greater than maximum allowed size (8126) for a record on index leaf page. 2019-09-24 1:11:07 0 [Note] WSREP: (086791d9, 'tcp://172.30.20.129:4567') turning message relay requesting off   ===================================== 2019-09-24 01:11:26 0x7fa92b456700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 20 seconds ----------------- BACKGROUND THREAD ----------------- srv_master_thread loops: 1 srv_active, 0 srv_shutdown, 0 srv_idle srv_master_thread log flush and writes: 0 ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 210 --Thread 140363891283712 has waited at ibuf0ibuf.cc line 4465 for 19.00 seconds the semaphore: S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 --Thread 140363899676416 has waited at ibuf0ibuf.cc line 4465 for 19.00 seconds the semaphore: S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 Similar logs follow after every INNODB MONITOR OUTPUT. After a while, I start getting these in between InnoDB Monitor messages: 2019-09-24 1:15:07 0 [Warning] InnoDB: A long semaphore wait: --Thread 140392823322368 has waited at lock0lock.cc line 6261 for 241.00 seconds the semaphore: Mutex at 0x56329c07c500, Mutex LOCK_SYS created lock0lock.cc:460, lock var 2   2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140363891283712 has waited at ibuf0ibuf.cc line 4465 for 240.00 seconds the semaphore: S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140363899676416 has waited at ibuf0ibuf.cc line 4465 for 240.00 seconds the semaphore: S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140363950032640 has waited at ibuf0ibuf.cc line 4465 for 240.00 seconds the semaphore: S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140392823322368 has waited at lock0lock.cc line 6261 for 241.00 seconds the semaphore: Mutex at 0x56329c07c500, Mutex LOCK_SYS created lock0lock.cc:460, lock var 2   2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140363924854528 has waited at lock0lock.cc line 3389 for 240.00 seconds the semaphore: Mutex at 0x56329c07c500, Mutex LOCK_SYS created lock0lock.cc:460, lock var 2   2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140363941639936 has waited at ibuf0ibuf.cc line 4465 for 240.00 seconds the semaphore: S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140364535383808 has waited at ibuf0ibuf.cc line 2456 for 240.00 seconds the semaphore: S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140363815749376 has waited at buf0flu.cc line 1231 for 240.00 seconds the semaphore: SX-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 2019-09-24 1:15:07 0 [Note] InnoDB: A semaphore wait: --Thread 140364409558784 has waited at ibuf0ibuf.cc line 4465 for 238.00 seconds the semaphore: S-lock on RW-latch at 0x7faf63b29508 created in file buf0buf.cc line 1609 a writer (thread id 140363924854528) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file ibuf0ibuf.cc line 4465 InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info: InnoDB: Pending reads 0, writes 0

            It still fails on BB, 10.2.

            stepan.patryshev Stepan Patryshev (Inactive) added a comment - It still fails on BB, 10.2 .

            People

              jplindst Jan Lindström (Inactive)
              jplindst Jan Lindström (Inactive)
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.