[MDEV-21215] Random InnoDB: fsync() returned 5 using Btrfs with 10.3.17 Created: 2019-12-04 Updated: 2020-03-16 Resolved: 2020-01-12 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Platform Debian, Storage Engine - InnoDB |
| Affects Version/s: | 10.3.17 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Laszlo Laci | Assignee: | Unassigned |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Debian 10 64 bit Btrfs |
||
| Description |
|
We upgraded Debian 9 (MariaDB 10.1.38) to Debian 10 (MariaDB 10.3.17). We experienced huge slowdowns (perhaps related to https://jira.mariadb.org/browse/MDEV-16333). We tried to speed things up with this setting:

innodb_flush_method = O_DIRECT_NO_FSYNC

But it randomly produces errors like this:

2019-12-04 0:24:31 12111440 [ERROR] [FATAL] InnoDB: fsync() returned 5
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
Server version: 10.3.17-MariaDB-0+deb10u1-log |
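To see how often and when the failure occurs, the error log can be scanned for the fatal line quoted above. The following is only a minimal sketch: the pattern is modeled on that single quoted log line, the function name `find_fsync_failures` is hypothetical, and other MariaDB versions may format the message differently.

```python
import re

# Pattern modeled on the one log line quoted in the description (assumption:
# other server versions may word or order the fields differently).
FSYNC_FATAL = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>\d{1,2}:\d{2}:\d{2})\s+\d+\s+"
    r"\[ERROR\] \[FATAL\] InnoDB: fsync\(\) returned (?P<code>\d+)"
)


def find_fsync_failures(lines):
    """Return (date, time, error_code) tuples for each fatal fsync() line."""
    hits = []
    for line in lines:
        match = FSYNC_FATAL.match(line.strip())
        if match:
            hits.append(
                (match.group("date"), match.group("time"), int(match.group("code")))
            )
    return hits
```

Passing an open log file to `find_fsync_failures` works because file objects iterate line by line; the path of the error log depends on the installation.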
| Comments |
| Comment by Marko Mäkelä [ 2019-12-13 ] | ||
|
We got a similar report in The error code 5 should be "Input/Output error". laci, do you see any messages about file system corruption or block device errors in the output of the following commands?
Also, if applicable, I would recommend checking sudo smartctl -A /dev/sda (assuming that the file system of the InnoDB data directory is located on that device). | ||
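The mapping from the numeric code in the log to its symbolic name can be checked quickly. This is a small sketch using only the Python standard library; the helper name `describe_errno` is made up for illustration, and the message text returned by the C library can vary by platform.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, platform message) for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# On Linux, code 5 is expected to resolve to the name EIO.
print(describe_errno(5))
```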
| Comment by Laszlo Laci [ 2019-12-16 ] | ||
|
It's a Xen VM, and we see the same problems with other VMs too. The VMs run on different dedicated servers; none of them has disk errors. | ||
| Comment by Marko Mäkelä [ 2019-12-19 ] | ||
|
laci, given that hardware failure has been ruled out, I would primarily point the finger at the file system (btrfs). A quick search returned a Linux kernel fix for something in fsync() on btrfs. It might not exactly match what you are seeing, because it mentions an assertion failure. If those assertions are not enabled in normal kernel builds, then under that scenario you might observe fsync() returning EIO instead. I wonder if a different innodb_flush_method could work around it. As far as I know, we do not use btrfs in internal testing, and I do not remember the fsync() call ever failing in our internal tests. | ||
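For readers unfamiliar with how an fsync() failure surfaces to an application, here is a rough sketch of a write followed by fsync() that treats EIO as unrecoverable. It is not InnoDB's code; the function name `write_and_sync` and the choice to raise `RuntimeError` are assumptions made purely for illustration.

```python
import errno
import os


def write_and_sync(path, data):
    """Write data to path and flush it to stable storage via fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)
    except OSError as exc:
        if exc.errno == errno.EIO:
            # Illustrative policy: after EIO the data cannot be assumed
            # durable, so give up instead of retrying.
            raise RuntimeError(f"I/O error while syncing {path}") from exc
        raise
    finally:
        os.close(fd)
```

Stopping instead of retrying mirrors the fatal error seen in the report: once the kernel has reported EIO for a flush, a later successful fsync() does not necessarily mean the earlier data reached the disk.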
| Comment by Laszlo Laci [ 2019-12-20 ] | ||
|
Thank you very much. The Debian buster-backports kernel 5.3.9-2~bpo10+1 contains that patch. I will install it and let it run for 2-3 weeks; I hope it fixes this problem. Early next year, I'll let you know whether the issue has been resolved with that kernel. Which file systems do you use in MariaDB internal testing? | ||
| Comment by Marko Mäkelä [ 2020-01-03 ] | ||
|
laci, did the kernel upgrade help? | ||
| Comment by Laszlo Laci [ 2020-01-06 ] | ||
|
The backport kernel seems to have fixed the bug; it hasn't come up since. |