MariaDB Server / MDEV-16586

Dead-lock in MYSQL_BIN_LOG::reset_logs(..) ?

Details

    Description

      I installed MariaDB 10.3.7, initialized it with my own configs (from previous setups), set a root user, etc., and then executed:

      /usr/bin/mysql -NAB -u'root' -p'!password$$' -e "RESET MASTER"

      This command has been stuck for hours (almost half a day by now).

      I have attached my mysqld config, logs, and the backtrace of mysqld too. It seems to be waiting for a mutex, possibly an internal dead-lock?

      Actually there are two issues here:
      1) The server gets stuck executing 'RESET MASTER'.
      ( The GDB backtrace suggests it is waiting for a mutex; possibly a dead-lock? )
      2) The client never times out. How is that?
      ( It has been running since yesterday, ~15:16 or so. )
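
On the second point, one way to keep a scripted client call from blocking forever is to bound it from the outside. The following is only a sketch, assuming the coreutils `timeout` utility is available; `run_bounded` is a hypothetical helper name, not something the mysql client provides, and the limit is given in seconds.

```shell
#!/bin/sh
# run_bounded SECONDS COMMAND [ARGS...]
# Hypothetical helper: run a command, but give up after SECONDS.
# Relies on coreutils `timeout`, which exits with status 124 on expiry.
run_bounded() {
    limit="$1"
    shift
    rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "command exceeded ${limit}s and was terminated" >&2
    fi
    return "$rc"
}

# Example with the command from this report (not executed here):
# run_bounded 60 /usr/bin/mysql -NAB -u'root' -p'!password$$' -e "RESET MASTER"
```

This only bounds the client side. If the statement is stuck inside the server, terminating the client would not be expected to release whatever the server thread is waiting for.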

      Unfortunately MariaDB doesn't provide -dbg packages (or I just don't see them), so I couldn't provide a better backtrace at the moment.
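
For reference, a backtrace of every thread of a hung process can be captured non-interactively with gdb. This is a minimal sketch, assuming gdb is installed and the user is allowed to attach to the process; `capture_bt` is a hypothetical helper name and the output file name is arbitrary.

```shell
#!/bin/sh
# capture_bt PID OUTFILE: dump backtraces of all threads of a running process.
# Hypothetical helper; needs gdb and permission to attach to PID.
capture_bt() {
    pid="$1"
    out="$2"
    # Refuse early if the process does not exist (or cannot be signalled).
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # -batch exits after the -ex commands; pagination off avoids prompts.
    gdb -p "$pid" -batch \
        -ex "set pagination off" \
        -ex "thread apply all bt" > "$out" 2>&1
}

# Example, assuming a single mysqld process is running:
# capture_bt "$(pidof mysqld)" backtrace.txt
```

Without debug symbols the frames will still show function names where available, but not source lines.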

      Attachments

        1. backtrace.txt
          22 kB
          David Kedves
        2. bt.txt
          40 kB
          David Kedves
        3. my.cnf
          2 kB
          David Kedves
        4. my.cnf.txt
          2 kB
          David Kedves
        5. mysqld.log
          8 kB
          David Kedves
        6. packages.txt
          2 kB
          David Kedves
        7. processlist.txt
          0.6 kB
          David Kedves
        8. processlist.txt
          1 kB
          David Kedves
        9. psaux.txt
          9 kB
          David Kedves
        10. pstree.txt
          2 kB
          David Kedves
        11. reproduced_again.txt
          23 kB
          David Kedves
        12. rpmqa.txt
          9 kB
          David Kedves

          Activity

            kedazo David Kedves added a comment -

            Hi,

            bt.txt my.cnf.txt processlist.txt psaux.txt pstree.txt rpmqa.txt

            It just happened to me again, this time on CentOS 6 (on a qemu cloud image). This time it didn't "freeze" on the reset of the logs but on a create user command issued after it. However, in the processlist I still see the "reset master" in initializing state, and I guess this causes the "create user" to deadlock somehow when mysqld tries to write to the binlogs (just guessing). Anyhow, this time I could capture a backtrace with line numbers and everything; please see the new set of attachments to this comment.

            Thanks, David


            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Hi kedazo,

            Thanks for providing the backtrace. Can you provide the mysqld.log file as well? Your earlier mysqld.log file contains:

            2018-06-25 15:20:03 73 [Note] Semi-sync replication initialized for transactions.
            2018-06-25 15:20:03 73 [Note] Semi-sync replication enabled on the master.
            2018-06-25 15:20:03 0 [Note] Starting ack receiver thread
            2018-06-25 15:20:06 79 [Note] Semi-sync replication switched OFF.
            2018-06-25 15:20:06 79 [Note] Semi-sync replication disabled on the master.

            So is this server a master, or is there no replication involved at all?

            kedazo David Kedves added a comment -

            Sorry, I have already destroyed this VM. It is a single instance, so that is right: there is no replication involved at all.
            I will reproduce this tomorrow (it is easily reproducible with this my.cnf; it always happens) and attach the log file then.


            sachin.setiya.007 Sachin Setiya (Inactive) added a comment -

            Hi kedazo,

            Thanks. I have 3 more questions:
            1. Does the server have some data, or was it a completely new instance?
            2. If it has some data, then I think you are issuing RESET MASTER after loading the data.
            3. Is it possible for you to upload the data to the MariaDB private server, so that I can simulate it?
            https://mariadb.com/kb/en/meta/mariadb-ftp-server/

            Elkin Andrei Elkin added a comment -

            The latest bt attachment https://jira.mariadb.org/secure/attachment/45808/backtrace.txt (thanks, kedazo) contains symbols and suggests that Thread 4 ("RESET MASTER") is waiting for the end of a binlog checkpoint request while holding a mutex lock that Thread 2 is waiting for. It's not a deadlock between the two. The reason for the binlog checkpoint response delay or loss is unclear. There's no other thread that keeps a running transaction that might be blamed for the delay.

            Overall, I think we need some more specific instructions on how to reproduce.


            People

              sachin.setiya.007 Sachin Setiya (Inactive)
              kedazo David Kedves