Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20391

Add information about the progress of transaction rollback (and waiting for replication threads to complete) to the error log upon shutdown

    Details

      Description

      Currently when shutdown happens, innodb_fast_shutdown<>2 snd InnoDB has to roll back some huge transaction, there is no output (but one empty row) until this process is completed. Consider this error log sample:

      2019-08-19 16:08:58 139669736232704 [Note] /usr/sbin/mysqld (initiated by: unknown): Normal shutdown
      2019-08-19 16:08:58 139669859714816 [Note] InnoDB: FTS optimize thread exiting.
      2019-08-19 16:08:58 139669736232704 [Note] Event Scheduler: Purging the queue. 0 events
      2019-08-19 16:08:58 139669751666432 [Note] Error reading relay log event: slave SQL thread was killed
      2019-08-19 16:08:58 139669770405632 [Note] Slave I/O thread exiting, read up to log 'binlog.1519705', position 10122889; GTID position 0-198067036-22956651797,999-186069024-2
      2019-08-19 16:08:59 139669751666432 [Note] Slave SQL thread exiting, replication stopped in log 'binlog.1519705' at position 10122889; GTID position '0-198067036-22956651797,999-186069024-2'
      2019-08-19 16:09:19 139669736232704 [Warning] /usr/sbin/mysqld: Forcing close of thread 4999326 user: 'user1'
      2019-08-19 16:09:19 139669736232704 [Warning] /usr/sbin/mysqld: Forcing close of thread 4999326 user: ‘user1’
       
      2019-08-19 17:22:46 139669736232704 [Note] unregister_replicator OK
      2019-08-19 17:22:46 139669736232704 [Note] InnoDB: Starting shutdown...
      2019-08-19 17:22:46 139669817751296 [Note] InnoDB: Dumping buffer pool(s) to /mysql/ppc/data/ib_buffer_pool
      

      As you can see, we get nothing but one empty row for more than an hour. Backtrace analysis shows that there is only one active thread doing rollback of a huge transaction, but there is no way to find this out without OS level tools like gdb, perf or bpftrace.

      How poor DBA should know that something is in progress and there is no hang? Please, add information about the transaction(s) rollback stage (maybe if they work for longer than few seconds at least) like it was done for other stages of normal shutdown above, and consider adding progress reports similar to those that are written upon startup and recovery.

      On a separate note, in versions with MDEV-18450 implemented it would be useful to add the output about waiting for replication-related dump threads to be stopped properly. I may be wrong, but cursory review of the related commits does not seem to give me a confidence that there is a proper reporting of the progress on that stage in non-debug binaries.

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              valerii Valerii Kravchuk
            • Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated: