MariaDB MaxScale / MXS-1045

Defunct processes after MaxScale has executed a script during failover

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.3, 2.0.2
    • Fix Version/s: 2.0.3
    • Component/s: Core, mmmon
    • Environment: Reproduced on Ubuntu Xenial with Master <-> Master replication,
      using both MariaDB 10.1 and MySQL 5.6.

    Description

      Steps to reproduce:

      Create a Master <-> Master replication.

      With docker:

      docker run \
      --name master101 \
      -d \
      -p 32810:3306 \
      -e MYSQL_ROOT_PASSWORD=maria2016 \
      mariadb:10.1 \
      --server-id=7 \
      --log-bin

      docker run \
      --name master102 \
      -d \
      -p 32811:3306 \
      -e MYSQL_ROOT_PASSWORD=maria2016 \
      mariadb:10.1 \
      --server-id=8 \
      --log-bin

      Execute on both servers:
      GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%'
      IDENTIFIED BY 'slave2016';

      Execute on master101:
      show master status;
      CHANGE MASTER TO
      MASTER_HOST='127.0.0.1',
      MASTER_PORT=32821,
      MASTER_USER='repl',
      MASTER_PASSWORD='slave2016';

      Execute on master102:
      show master status;
      CHANGE MASTER TO
      MASTER_HOST='127.0.0.1',
      MASTER_PORT=32820,
      MASTER_USER='repl',
      MASTER_PASSWORD='slave2016';

      Execute on both servers:
      start slave;
      CREATE USER 'maxscale'@'%' IDENTIFIED BY 'maxscale';
      GRANT EXECUTE, PROCESS, SELECT, SHOW DATABASES, SHOW VIEW, ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TABLESPACE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, EVENT, INDEX, INSERT, REFERENCES, TRIGGER, UPDATE, CREATE USER, FILE, LOCK TABLES, RELOAD, REPLICATION CLIENT, REPLICATION SLAVE, SHUTDOWN, SUPER ON *.* TO 'mm'@'%';

      Install MaxScale 2.0.2 with the attached configuration file (copy it to /etc).

      Copy test.sh to /var/lib/maxscale.

      Execute:
      sudo service maxscale start;

      docker stop master101;

      Result:

      root 21165 0.1 0.0 332652 8736 ? Ssl 12:32 0:06 /usr/bin/maxscale -d
      maxscale 23707 0.1 0.0 287900 9304 ? Ssl 14:05 0:00 /usr/bin/maxscale --user=maxscale
      maxscale 23741 0.0 0.0 0 0 ? Z 14:09 0:00 [test.sh] <defunct>
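
To look for leftover children like the `[test.sh] <defunct>` entry above, the process table can be filtered on its state column. This is a minimal sketch, assuming the procps `ps` that ships with Ubuntu; the function names `filter_defunct` and `list_defunct` are made up for illustration.

```shell
# Keep only zombie rows from `ps`-style input.
# Expects the columns PID PPID STAT COMMAND, with one header line.
filter_defunct() {
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}

# List PID, parent PID and command name of every defunct process.
list_defunct() {
    ps -eo pid,ppid,stat,comm | filter_defunct
}
```

The parent PID in the second column shows which process has not yet reaped the child.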

      Output written to failover.log by test.sh:

      2016.12.01 14:09:21 : --event 'master_down' --initiator '127.0.0.1:32810' --nodelist '127.0.0.1:32811' --
      ###############
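
The attached test.sh is not reproduced in this report. As a hypothetical stand-in, the sketch below writes a log line of the same shape as the one above. It assumes bash and the util-linux `getopt`; the function name `log_event` and the choice of log path are illustrative and not taken from the attachment.

```shell
# Hypothetical stand-in for a monitor script: normalise the long options
# with util-linux getopt and append one timestamped line to a log file.
log_event() {
    local logfile=$1
    shift
    local args
    # getopt prints the options as: --event 'x' --initiator 'y' --nodelist 'z' --
    args=$(getopt -o '' --long event:,initiator:,nodelist: -- "$@") || return 1
    echo "$(date '+%Y.%m.%d %H:%M:%S') :$args" >> "$logfile"
}

# A script built on this would end with a call such as:
# log_event /path/to/failover.log "$@"
```

A script like this returns almost immediately, so a lingering defunct entry points at the parent not reaping the child rather than at the script itself.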

      MaxScale log with info and debug enabled:

      2016-12-01 14:13:43 error : Monitor was unable to connect to server 127.0.0.1:32810 : "Lost connection to MySQL server at 'handshake: reading inital communication packet', system error: 115"
      2016-12-01 14:13:43 debug : Backend server 127.0.0.1:32810 state : DOWN
      2016-12-01 14:13:43 notice : Server changed state: server1[127.0.0.1:32810]: master_down. [Master, Running] -> [Down]
      2016-12-01 14:13:43 debug : [monitor_exec_cmd] Forked child process 23938 : /var/lib/maxscale/test.sh.
      2016-12-01 14:13:43 notice : Executed monitor script '/var/lib/maxscale/test.sh --event=$EVENT --initiator=$INITIATOR --nodelist=$NODELIST' on event 'master_down'.
      2016-12-01 14:13:43 debug : 139833263933184 [dcb_hangup_foreach]
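
The `[monitor_exec_cmd] Forked child process` messages carry the PID of each script run, so the log can be cross-checked against the process table. This is a rough sketch assuming the message format shown above and the procps `ps`; the function names are illustrative, and the log file path is left to the caller.

```shell
# Read a MaxScale log on stdin and print the PID from every
# "[monitor_exec_cmd] Forked child process <pid>" message.
forked_pids() {
    sed -n 's/.*\[monitor_exec_cmd\] Forked child process \([0-9][0-9]*\).*/\1/p'
}

# For a given log file, print PID, state and command name of each forked
# child that still has a process table entry. Reaped children print nothing.
forked_states() {
    forked_pids < "$1" | while read -r pid; do
        ps -o pid=,stat=,comm= -p "$pid"
    done
}
```

Any child that `forked_states` still reports with a state starting with `Z` has exited but has not been waited for.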

      Richard Stracke

      Attachments

        1. maxscale.cnf
          0.8 kB
        2. test.sh
          0.3 kB

        Activity

          Richard Stracke added a comment -

          One additional comment.

          The defunct processes vanish after sudo service maxscale stop or restart.

          Richard

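
That defunct entries disappear when the service stops matches general Unix behaviour: a zombie stays in the process table until its parent waits for it or exits. The sketch below reproduces the effect outside MaxScale. It assumes bash or a POSIX shell, GNU `sleep` with fractional seconds, and procps `ps`; `demo_zombie` is a made-up name.

```shell
# Start a parent that forks a short-lived child and never waits for it,
# then print the state of that child.
demo_zombie() {
    # `exec` replaces the subshell with `sleep 3`, which never calls wait(),
    # so the finished `sleep 0.2` child has nobody to reap it.
    ( sleep 0.2 & exec sleep 3 ) &
    parent=$!
    sleep 1
    # State of the parent's children; a defunct child starts with Z.
    ps -o stat= --ppid "$parent"
    # Once the parent exits, init adopts and reaps the zombie.
    kill "$parent"
}
```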
          markus makela added a comment -

          I tested this quickly on Fedora 25 with the exact script and both the mysqlmon and mmmon modules, but I was unable to reproduce it. I'll continue the investigation on Ubuntu Xenial.

          markus makela added a comment -

          I've managed to reproduce it and it seems to happen even with 2.0.2.

          markus makela added a comment -

          For some reason, the processes aren't sending the SIGCHLD signal to the parent process.

          2016-12-05 13:42:29   notice : [monitor_exec_cmd] Forked child process 17504 : /var/lib/maxscale/test.sh.
          2016-12-05 13:42:29   notice : Executed monitor script '/var/lib/maxscale/test.sh --event=$EVENT --initiator=$INITIATOR --nodelist=$NODELIST' on event 'master_down'.

          Normally, we'd see a log message about a SIGCHLD handler being called.

          markus makela added a comment - edited

          This seems to be caused by the SIGCHLD signal not being deleted from the original process's signal list. Removing the SIGCHLD handler for the parent process of the daemon process seems to fix this.

          This also never happens when MaxScale is run directly from the terminal with the -d flag.

          markus makela added a comment -

          The child process signals were ignored by the daemon process. Deleting the signal from the original parent's signal list fixes this.
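
On Linux, the signals a running process blocks or ignores are exposed as the hex bitmasks `SigBlk` and `SigIgn` in `/proc/<pid>/status`. If the "signal list" in the comments refers to such a set (an assumption, since the report does not show the code), the daemon can be inspected as sketched below. The function names are illustrative, and signal number 17 is SIGCHLD on x86 and ARM Linux only.

```shell
# Succeed if signal number $2 is set in the hex bitmask $1.
# Bit (n - 1) of the mask corresponds to signal n.
sig_bit_set() {
    [ $(( (0x$1 >> ($2 - 1)) & 1 )) -eq 1 ]
}

# Report whether SIGCHLD (assumed to be signal 17) is blocked or ignored
# by the process with the given PID.
check_sigchld() {
    for field in SigBlk SigIgn; do
        mask=$(awk -v f="$field:" '$1 == f { print $2 }' "/proc/$1/status")
        if sig_bit_set "$mask" 17; then
            echo "$field: SIGCHLD set"
        else
            echo "$field: SIGCHLD clear"
        fi
    done
}
```

Running `check_sigchld` against the PID of the `/usr/bin/maxscale` process before and after upgrading would show whether the mask differs between versions.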

          People

            Assignee: markus makela
            Reporter: Richard Stracke