[MDEV-10843] XtraDB Semaphore Stalls with innodb_use_mtflush enabled - Jira

Details

Type: Bug
Status: Closed
Priority: Minor
Resolution: Won't Fix
Affects Version/s: 10.1.17
Fix Version/s: N/A
Component/s: Storage Engine - XtraDB
Labels:
- innodb
- xtradb
Environment:
CentOS 7 and Gentoo

Description

I have 3 servers in production all recently upgraded to MariaDB 10.1 to, among other things, make use of InnoDB Page Compression.

All 3 use XFS on the data partition, over the top of hardware RAID10 with SSDs.

CentOS boxes are 16 cores/32 threads, 256GB RAM
Gentoo box is 20 cores/40 threads, 256GB RAM

I periodically need to run through and OPTIMIZE several of our InnoDB tables. Some need it because of heavy delete operations in production; others because, when I restore from a snapshot, tar doesn't preserve the sparseness of the InnoDB files, so I run an optimize across them to get the space back.

However, when running a mass optimize, I kept running into hangs, with InnoDB complaining about semaphore locks etc. in the error log.

I've since discovered that if I disable innodb_use_mtflush, the stall doesn't occur at all, across the entire DB (there are a few thousand tables using page compression).

I've attached an innodb status from when the semaphore hang occurs.

I've experimented with various values for adaptive hash index partitions, InnoDB thread concurrency, I/O read and write threads, etc. The only thing that appears to make the problem happen or go away is whether mtflush is enabled or not.

Apologies if mtflush is ONLY for FusionIO, but it isn't readily apparent that you shouldn't use it elsewhere, only that it was created to improve performance with FusionIO.

Attachments

- innodb-status (39 kB, 2016-09-20 15:56)
- mysql-status (97 kB, 2016-10-01 07:34)
- mysql-variables (19 kB, 2016-10-01 07:34)

Issue Links

relates to

- MDEV-12496 mtflush thread's hang cause mysqld crash (Closed)
- MDEV-12722 Maria DB 10.1.16 freeze (Closed)

Activity

Jan Lindström (Inactive) added a comment - 2016-09-27 05:50

innodb-use-mtflush is not intended to be used only for FusionIO; I have also used it for traditional SSD and HD. Multi-threaded flush is beneficial only when your device is fast enough. I would need to see your configuration file, i.e. my.cnf, and especially how many threads you allocated for mtflush and your values for innodb_lru_scan_depth, innodb_io_capacity, and innodb_io_capacity_max. These should be tuned correctly. The long semaphore wait you are seeing is most likely not about mtflush but something else. Can I also have the output from dmesg or similar?
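For anyone collecting the settings requested above, here is a minimal sketch that pulls them out of a my.cnf-style file. It is an illustration only: the helper name `flush_settings`, the sample values, and the choice of `innodb_mtflush_threads` as the variable holding the mtflush thread count are assumptions, not something stated in this ticket. It also does not handle `!include` or `!includedir` directives.

```python
import configparser

# Settings named in the comment above. innodb_mtflush_threads is an
# assumption about which variable holds the mtflush thread count.
WANTED = (
    "innodb_use_mtflush",
    "innodb_mtflush_threads",
    "innodb_lru_scan_depth",
    "innodb_io_capacity",
    "innodb_io_capacity_max",
)

# Hypothetical sample; these values are NOT taken from the reporter's servers.
SAMPLE = """
[mysqld]
innodb-use-mtflush = 1
innodb_mtflush_threads = 8
innodb_io_capacity = 2000
skip-name-resolve
"""


def flush_settings(cnf_text, section="mysqld"):
    """Return the flush-related settings found in one section of a my.cnf text."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare flags such as skip-name-resolve
        strict=False,          # tolerate repeated keys; the last one wins
        interpolation=None,    # treat '%' literally
    )
    parser.read_string(cnf_text)
    if not parser.has_section(section):
        return {}
    found = {}
    for key, value in parser.items(section):
        # Option names may be written with dashes or underscores.
        name = key.replace("-", "_")
        if name in WANTED:
            found[name] = value
    return found
```

Calling `flush_settings(SAMPLE)` returns only the settings present in the `[mysqld]` section, as strings; any setting missing from the result is running at its server default and would need to be read from SHOW GLOBAL VARIABLES instead.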

Alex Boag-Munroe added a comment - 2016-10-01 07:34 - edited

Hi Jan

Thanks for the response. Apologies for the delayed reply; I haven't been able to spend much time on this issue this week.

Attached is the output of SHOW GLOBAL VARIABLES and SHOW GLOBAL STATUS: mysql-variables and mysql-status.

There was nothing pertaining to MySQL in dmesg or syslog at the time of the hang.

Jan Lindström added a comment - 2023-04-11 07:32

10.1 is EOL.

People

Assignee: Jan Lindström (Inactive)

Reporter: Alex Boag-Munroe

Votes: 2

Watchers: 7

Dates

Created: 2016-09-20 15:58

Updated: 2023-04-12 12:07

Resolved: 2023-04-11 07:32
