Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-9595

Shutdown takes forever with many replication channels

    XMLWordPrintable

    Details

      Description

      Shutting down MariaDB by SIGTERM when it has many (20) replication channels (with parallel replication enabled) seems never-ending. But if I stop all slaves before shutdown, it takes a few seconds. Moreover, mysqld reports errors on I/O threads.

      I'd expect that server stops all slaves itself.

      # sudo systemctl status mariadb -n 64
      ● mariadb.service - MariaDB server
         Loaded: loaded (/nix/store/4hm49xrn8md0swvc3sm9n2qhafwllgc1-unit-mariadb.service/mariadb.service)
         Active: deactivating (stop-sigterm) since Fri 2016-02-19 21:06:29 SGT; 1h 50min ago
        Process: 3189 ExecStartPre=/nix/store/m8hhisn7gag7iycddw8hflila3gi71h3-unit-script/bin/mariadb-pre-start (code=exited, status=0/SUCCESS)
       Main PID: 3640 (mysqld)
         CGroup: /system.slice/mariadb.service
                 └─3640 /nix/store/z7x0ndrckrsmqr3163hp9kgs9hwags61-mariadb-10.1.11/bin/mysqld --defaults-file=/nix/store/8lgvar6h0696ngpsladkvmqwbd5k6x5v-mysqld.cnf
       
      Feb 19 20:04:23 grumpycat mysqld[3640]: 2016-02-19 20:04:23 139653215652608 [Note] Master 'oms-my': Slave SQL thread initialized, starting replication in l...n: 3761334
      Feb 19 21:06:29 grumpycat systemd[1]: Stopping MariaDB server...
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652892388096 [Note] /nix/store/z7x0ndrckrsmqr3163hp9kgs9hwags61-mariadb-10.1.11/bin/mysqld: ...l shutdown
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652892388096 [Note] Event Scheduler: Purging the queue. 0 events
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139653540129536 [Note] Master 'bob-th': Error reading relay log event: slave SQL thread was killed
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652246162176 [ERROR] Master 'bob-th': Error reading packet from server: Lost connection to M...rrno=2013)
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652246162176 [Note] Master 'bob-th': Slave I/O thread exiting, read up to log 'mysql-bin-cha...on 3256007
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139653539826432 [Note] Master 'zops': Error reading relay log event: slave SQL thread was killed
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652894509824 [ERROR] Master 'zops': Error reading packet from server: Lost connection to MyS...rrno=2013)
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652894509824 [Note] Master 'zops': Slave I/O thread exiting, read up to log 'mysql-bin-chang...sition 422
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139653999532800 [Note] Master 'bob-my': Error reading relay log event: slave SQL thread was killed
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652570942208 [ERROR] Master 'bob-my': Error reading packet from server: Lost connection to M...rrno=2013)
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652570942208 [Note] Master 'bob-my': Slave I/O thread exiting, read up to log 'mysql-bin-cha...on 3947621
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139653538310912 [Note] Master 'bob-hk': Error reading relay log event: slave SQL thread was killed
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652570336000 [ERROR] Master 'bob-hk': Error reading packet from server: Lost connection to M...rrno=2013)
      Feb 19 21:06:29 grumpycat mysqld[3640]: 2016-02-19 21:06:29 139652570336000 [Note] Master 'bob-hk': Slave I/O thread exiting, read up to log 'mysql-bin-cha...on 1303108
      Feb 19 21:06:30 grumpycat mysqld[3640]: 2016-02-19 21:06:30 139653217471232 [Note] Master 'bob-ph': Error reading relay log event: slave SQL thread was killed
      Feb 19 21:06:30 grumpycat mysqld[3640]: 2016-02-19 21:06:30 139652892691200 [ERROR] Master 'bob-ph': Error reading packet from server: Lost connection to M...rrno=2013)
      Feb 19 21:06:30 grumpycat mysqld[3640]: 2016-02-19 21:06:30 139652892691200 [Note] Master 'bob-ph': Slave I/O thread exiting, read up to log 'mysql-bin-cha...on 8157090
      ...

      Server has 30G of RAM, and 4 CPUs. iotop shows ~100K/s read and write for mysqld process

      [mysqld]
      basedir = /nix/store/z7x0ndrckrsmqr3163hp9kgs9hwags61-mariadb-10.1.11
      init_file = /nix/store/71malbvn6kkn29yiwrsalnpirf2b6v62-init
      pid_file = /run/mysqld/mysqld.pid
      plugin_load = unix_socket=auth_socket.so
      datadir = /mariadb/db
      ignore_db_dirs = lost+found
      innodb_buffer_pool_instances = 64
      innodb_buffer_pool_size = 19327352832
      innodb_file_format = barracuda
      innodb_file_per_table = ON
      innodb_flush_log_at_trx_commit = 2
      innodb_flush_method = O_DIRECT
      innodb_lock_wait_timeout = 1800
      innodb_log_file_size = 314572800
      join_buffer_size = 1048576
      log_slave_updates = OFF
      max_allowed_packet = 268435456
      max_connections = 1000
      net_read_timeout = 1000
      net_write_timeout = 1000
      port = 3306
      relay_log = /mariadb/relay/cat-bin
      server_id = 112233
      skip_log_bin
      slave_compressed_protocol = ON
      slave_domain_parallel_threads = 10
      slave_net_timeout = 600
      slave_parallel_max_queued = 8388608
      slave_parallel_threads = 40
      sort_buffer_size = 4194304
      ssl_cert = /nix/store/rmlq71y8fnyx1k7sp3qx55fqwg45pl40-cats-cert.pem
      ssl_key = /run/keys/cats-key.pem
      table_open_cache = 30000
      ; replicate_*
      !include /nix/store/czaiav63ddqrz18sizhfy69jvv04nfcn-mysqld-repl.cnf

        Attachments

          Activity

            People

            Assignee:
            knielsen Kristian Nielsen
            Reporter:
            ip1981 Igor Pashev
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:

                Git Integration