[MDEV-20391] Add information about the progress of transaction rollback (and waiting for replication threads to complete) to the error log upon shutdown - Jira

XML

Word

Printable

Details

Type: Task
Status: Open (View Workflow)
Priority: Major
Resolution: Unresolved
Fix Version/s: 10.4(EOL)
Component/s: Replication, Storage Engine - InnoDB
Labels:

Description

Currently when shutdown happens, innodb_fast_shutdown<>2 snd InnoDB has to roll back some huge transaction, there is no output (but one empty row) until this process is completed. Consider this error log sample:

2019-08-19 16:08:58 139669736232704 [Note] /usr/sbin/mysqld (initiated by: unknown): Normal shutdown

2019-08-19 16:08:58 139669859714816 [Note] InnoDB: FTS optimize thread exiting.

2019-08-19 16:08:58 139669736232704 [Note] Event Scheduler: Purging the queue. 0 events

2019-08-19 16:08:58 139669751666432 [Note] Error reading relay log event: slave SQL thread was killed

2019-08-19 16:08:58 139669770405632 [Note] Slave I/O thread exiting, read up to log 'binlog.1519705', position 10122889; GTID position 0-198067036-22956651797,999-186069024-2

2019-08-19 16:08:59 139669751666432 [Note] Slave SQL thread exiting, replication stopped in log 'binlog.1519705' at position 10122889; GTID position '0-198067036-22956651797,999-186069024-2'

2019-08-19 16:09:19 139669736232704 [Warning] /usr/sbin/mysqld: Forcing close of thread 4999326 user: 'user1'

2019-08-19 16:09:19 139669736232704 [Warning] /usr/sbin/mysqld: Forcing close of thread 4999326 user: ‘user1’

2019-08-19 17:22:46 139669736232704 [Note] unregister_replicator OK

2019-08-19 17:22:46 139669736232704 [Note] InnoDB: Starting shutdown...

2019-08-19 17:22:46 139669817751296 [Note] InnoDB: Dumping buffer pool(s) to /mysql/ppc/data/ib_buffer_pool

As you can see, we get nothing but one empty row for more than an hour. Backtrace analysis shows that there is only one active thread doing rollback of a huge transaction, but there is no way to find this out without OS level tools like gdb, perf or bpftrace.

How poor DBA should know that something is in progress and there is no hang? Please, add information about the transaction(s) rollback stage (maybe if they work for longer than few seconds at least) like it was done for other stages of normal shutdown above, and consider adding progress reports similar to those that are written upon startup and recovery.

On a separate note, in versions with ~~MDEV-18450~~ implemented it would be useful to add the output about waiting for replication-related dump threads to be stopped properly. I may be wrong, but cursory review of the related commits does not seem to give me a confidence that there is a proper reporting of the progress on that stage in non-debug binaries.

Attachments

Activity

People

Assignee:: Andrei Elkin

Reporter:: Valerii Kravchuk

Votes:: 1 Vote for this issue

Watchers:: 7 Start watching this issue

Dates

Created:: 2019-08-20 10:10

Updated:: 2024-07-07 22:35

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.