[MDEV-11716] Online DDL Hangs
Created: 2017-01-03  Updated: 2019-12-12  Resolved: 2019-12-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - XtraDB
Affects Version/s: 10.1.17
Fix Version/s: N/A

Type: Bug
Priority: Major
Reporter: Brad Jorgensen
Assignee: Jan Lindström (Inactive)
Resolution: Incomplete
Votes: 0
Labels: galera, online-ddl, xtradb

Attachments: my.cnf

 Description   

I was recently running an optimize on some InnoDB tables and hit an issue that causes the process to hang until I intervene. I ran the optimize on my 3 Galera nodes one at a time, with wsrep_osu_method set to RSU so as not to affect the running cluster.

Innodb_checkpoint_age keeps increasing until it reaches a ceiling of about 90% of Innodb_checkpoint_max_age, at which point the process stops advancing (MySQL disk I/O stops). I have to set innodb_idle_flush_pct to a non-zero value to bring the checkpoint age down and let the process continue.

The statistics I have from the last time I ran into this are too low-resolution to be useful. I'll save everything useful that I can the next time I do an online DDL and update the issue, but hopefully this information is enough to start looking into it. The issue is really only a problem for larger tables (multiple GB), since the redo log has enough space for smaller tables.
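
For anyone trying to capture higher-resolution data on this, the sketch below shows one way the checkpoint fill level could be computed from sampled status values. It is only an illustration: the function names are hypothetical, the input is assumed to be (name, value) rows as returned by SHOW GLOBAL STATUS, and the 0.9 threshold is simply the "about 90%" figure reported above, not a documented server limit.

```python
def checkpoint_fill_ratio(status_rows):
    """Return Innodb_checkpoint_age / Innodb_checkpoint_max_age.

    status_rows is assumed to be an iterable of (variable_name, value)
    pairs, e.g. the rows of SHOW GLOBAL STATUS fetched by any client.
    """
    status = {name: value for name, value in status_rows}
    age = int(status["Innodb_checkpoint_age"])
    max_age = int(status["Innodb_checkpoint_max_age"])
    if max_age <= 0:
        raise ValueError("Innodb_checkpoint_max_age must be positive")
    return age / max_age


def is_near_stall(status_rows, threshold=0.9):
    """True when the checkpoint age has reached the assumed stall threshold.

    The default of 0.9 mirrors the roughly 90% level observed in this
    report; it is an assumption, not a value taken from the server.
    """
    return checkpoint_fill_ratio(status_rows) >= threshold
```

Sampling these two status variables every few seconds during the DDL and logging the ratio would show whether the stall coincides with the ratio flattening out, and when a change to innodb_idle_flush_pct starts to bring it down.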

I have tried adjusting the following options up and down to extremes, but only innodb_idle_flush_pct helps:
innodb_adaptive_flushing
innodb_adaptive_flushing_lwm
innodb_max_dirty_pages_pct
innodb_max_dirty_pages_pct_lwm
innodb_idle_flush_pct

I have attached my lightly redacted server config file.



 Comments   
Comment by Jan Lindström (Inactive) [ 2019-12-12 ]

No real data to work on. We would need at least some step-by-step instructions on how to reproduce this.
