[MDEV-32247] mysqld.exe has been unable to stop. Created: 2023-09-26  Updated: 2023-12-04  Resolved: 2023-12-04

Status: Closed
Project: MariaDB Server
Component/s: Platform Windows, Storage Engine - InnoDB
Affects Version/s: 10.5.22
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: kennyliao Assignee: Marko Mäkelä
Resolution: Incomplete Votes: 0
Labels: None

Attachments: PNG File ProcessMonitor.png    
Issue Links:
Relates
relates to MDEV-22577 innodb_fast_shutdown=0 fails to repor... Closed

 Description   

Stop the database instance using mysqladmin shutdown,More than 24 hours without stopping.

Msyqld has been unable to stop and runs until "InnoDB: FTS optimize thread exiting”

2023-09-22 9:16:30 0 [Note] C:\Server\mysql\bin\mysqld.exe (initiated by: sysdba[sysdba] @ [127.0.0.1]): Normal shutdown
2023-09-22 9:16:30 0 [Note] Event Scheduler: Killing the scheduler thread, thread id 2
2023-09-22 9:16:30 0 [Note] Event Scheduler: Waiting for the scheduler thread to reply
2023-09-22 9:16:30 0 [Note] Event Scheduler: Stopped
2023-09-22 9:16:30 0 [Note] Event Scheduler: Purging the queue. 52 events
2023-09-22 9:16:30 0 [Note] InnoDB: FTS optimize thread exiting.

The information seen by ProcessMonitor has been undergoing flushbufferfile and readfile operations



 Comments   
Comment by Daniel Black [ 2023-09-26 ]

What are your config options my_print_defaults --mysqld?

Comment by kennyliao [ 2023-09-26 ]

Daniel Black ,my config options as follows:

C:\Server\mysql\bin>my_print_defaults.exe --defaults-file=.\..\etc\my.ini mysqld
--secure_file_priv=..\mysql
--port=3307
--pid-file=Storage.pid
--basedir=..
--datadir=..\data
--tmpdir=..\tmp
--slave_load_tmpdir=..\tmp
--character-set-server=utf8
--sql-mode=NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
--table_open_cache=10240
--back_log=500
--max_connections=1000
--max_connect_errors=10000
--thread_cache_size=100
--thread_stack=192K
--sysdate-is-now=1
--ft_min_word_len=4
--lower_case_table_names=1
--max_allowed_packet=16M
--max_heap_table_size=64M
--max_length_for_sort_data=8M
--net_buffer_length=65536
--join_buffer_size=8M
--sort_buffer_size=16M
--binlog_cache_size=4M
--query_cache_type=0
--query_cache_size=0
--tmp_table_size=64M
--slave_net_timeout=30
--event_scheduler=ON
--skip-name-resolve=ON
--skip-external-locking=ON
--skip-slave-start=ON
--skip_symbolic_links=ON
--secure_auth=ON
--log-warnings=2
--skip-grant-tables=OFF
--allow-suspicious-udfs=OFF
--local_infile=OFF
--log-bin=Storage
--binlog-format=mixed
--max-binlog-size=512M
--expire-logs-days=7
--log-error=Storage
--server-id=1
--sync_binlog=1000
--slave_skip_errors=1517
--default-storage-engine=InnoDB
--innodb_file_per_table=1
--transaction_isolation=REPEATABLE-READ
--innodb_strict_mode=1
--innodb_buffer_pool_size=2048M
--innodb_data_file_path=ibdata1:200M:autoextend
--innodb_open_files=1000
--innodb_write_io_threads=32
--innodb_read_io_threads=32
--innodb_thread_concurrency=0
--innodb_io_capacity=8000
--innodb_flush_log_at_trx_commit=2
--innodb_log_buffer_size=16M
--innodb_log_file_size=100M
--innodb_adaptive_flushing=1
--innodb_max_dirty_pages_pct=60
--innodb_lock_wait_timeout=15
--innodb_fast_shutdown=0
--innodb_rollback_on_timeout=ON
--innodb_old_blocks_time=1000
--interactive_timeout=28800
--wait_timeout=28800
--innodb_lru_scan_depth=8096
--log_bin_trust_function_creators=1
--innodb-buffer-pool-load-at-startup=0
--innodb-buffer-pool-load-now=0
--innodb-buffer-pool-dump-at-shutdown=0
--innodb-buffer-pool-load-abort=1
--innodb-buffer-pool-instances=2
--innodb-page-cleaners=16
--innodb-purge-threads=8
--innodb-checksum-algorithm=CRC32
--optimizer_switch=index_merge=off

Comment by Vladislav Vaintroub [ 2023-09-26 ]

kennyliao, it would be good to get a stacktrace corresponding to one of the mentioned events in the procmon.
You can get it e.g with right click on the ReadFile event, chose "Stack". You might also need to configure symbols (Menu Options/Configure Symbols, add C:\Server\mysql\bin to ' Symbol Path')

Otherwise my guess is that this very slow shutdown comes from non-default "innodb-fast-shutdown=0" in your options, and by long time spent merging change buffer. I noticed buffer pool is tiny by todays standards, and this can influence that.

If this is the case, maybe one thing Innodb does wrong is being silent for a very long time during shutdown. marko, what do you think? Maybe it can tell "merging change buffer" at least?

Comment by Marko Mäkelä [ 2023-09-26 ]

There used to be an innodb_fast_shutdown=0 shutdown hang when the change buffer is corrupted. It was supposed to be fixed by MDEV-30009, already in 10.5.19. We’d need stack traces to diagnose the cause of this hang.

Comment by Vladislav Vaintroub [ 2023-09-26 ]

I understood it is "doing something", from kennyliao description, so maybe it is not a hang. Whether slow shutdown can take 24+ hours, I do not know, never seen, but if it is that slow, it is worth an Innodb-side announcement. Do not know if "FTS optimize thread exiting" could take all this time

Comment by kennyliao [ 2023-09-27 ]

I tried to kill the mysqld process later, but the mysqld.exe started and then stopped, but it still couldn't stop

Comment by Marko Mäkelä [ 2023-09-27 ]

We really need some stack traces. The file accesses mentioned in ProcessMonitor.png could be either due to change buffer merge or due to purging the history of committed transactions. If innodb_undo_tablespaces were used, then the purge of history should not access the system tablespace, but the change buffer merge would.

Comment by Marko Mäkelä [ 2023-09-27 ]

There already was a bug fix related to this:
MDEV-22577 innodb_fast_shutdown=0 fails to report progress

Specific to 10.6 (not sure if it could affect 10.5), while debugging MDEV-32050 I noticed that there would be only one message at the start of slow shutdown, mentioning the number of rows of history that needs to be purged. After my test workload, purging the history would take 30 to 90 seconds.

We still need some stack traces to understand what exactly is going on here.

Comment by Marko Mäkelä [ 2023-09-27 ]

For the record, with the bench.sh inside adaptive_purge.tar.gz of MDEV-26356, I got a slow shutdown like this:

10.6

Version: '10.6.16-MariaDB'  socket: '/mnt/sbtest/mysqld.sock'  port: 3306  Source distribution
2023-09-27 17:06:47 0 [Note] /dev/shm/10.6g/sql/mariadbd (initiated by: root[root] @ localhost []): Normal shutdown
2023-09-27 17:06:47 0 [Note] InnoDB: FTS optimize thread exiting.
2023-09-27 17:06:47 0 [Note] InnoDB: to purge 26366271 transactions
2023-09-27 17:12:19 0 [Note] InnoDB: Starting shutdown...
2023-09-27 17:12:19 0 [Note] InnoDB: Dumping buffer pool(s) to /mnt/sbtest/ib_buffer_pool
2023-09-27 17:12:19 0 [Note] InnoDB: Buffer pool(s) dump completed at 230927 17:12:19
2023-09-27 17:12:19 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
2023-09-27 17:12:19 0 [Note] InnoDB: Shutdown completed; log sequence number 73559731602; transaction id 173642215
2023-09-27 17:12:19 0 [Note] /dev/shm/10.6g/sql/mariadbd: Shutdown complete

This was with 12GiB of data in a 40GiB buffer pool but a tiny redo log compared to the workload (innodb_log_file_size=10g). As you can see, there was a message about the purge, but no further progress report for almost 6 minutes.

This reported case could also be due to a change buffer merge. At least here we always get a message about the purge of history.

Comment by Marko Mäkelä [ 2023-09-28 ]

The missing "to purge" messages should be a 10.6+ only problem. As far as I can tell, that was caused by the merge of MDEV-31382 from 10.5 to 10.6.

Comment by Vladislav Vaintroub [ 2023-11-04 ]

kennyliao, can you provide some stacktraces, as described in https://jira.mariadb.org/browse/MDEV-32247?focusedCommentId=270421&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-270421?

Comment by Vladislav Vaintroub [ 2023-11-04 ]

Reassigned to marko, I'm not sure how I can help further.

Generated at Thu Feb 08 10:29:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.