[MDBF-321] buildbot to use real filesystems for [at least] storage engine test suites Created: 2022-01-27  Updated: 2023-03-09

Status: Open
Project: MariaDB Foundation Development
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Daniel Black Assignee: Faustin Lammler
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
causes MDEV-29045 document MariaDB server on various fi... Open
Relates
relates to MDEV-23599 btrfs coming to Fedora 33 - soon Octo... Closed

 Description   

A number of issues clearly show that parts of the server and the kernel go untested because of the overuse of tmpfs as a test directory:

E.g:

The only real solution is to utilize a variety of real filesystems (not just ext4), including btrfs, xfs and zfs. As an extension, maybe even cifs, NFS (not uncommon in cloud storage), and virtiofs.

I do realize that real IOPS are limited.

Potential ideas:
https://www.kernel.org/doc/html/latest/admin-guide/blockdev/ramdisk.html may be one option.

non-sparse file on /dev/shm
losetup on it
mkfs on the /dev/loop
pass that to vm/container
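
A minimal sketch of the first three steps above, written as a dry run that only builds the command lines rather than executing them (running them needs root). The image path, size, loop device name and the helper name `loop_fs_commands` are all hypothetical, and reading the first step as a fully allocated (non-sparse) file created with `fallocate` is an assumption. How the device is then handed to the VM or container depends on the tooling in use, so that step is left out.

```python
def loop_fs_commands(image, size_mib, fstype, loopdev="/dev/loop0"):
    """Return the argv lists for: allocate file, attach loop device, mkfs.

    Nothing is executed here; a caller with root privileges could pass
    each list to subprocess.run(cmd, check=True).
    """
    return [
        # Allocate the backing file up front (assumed: non-sparse).
        ["fallocate", "--length", f"{size_mib}MiB", image],
        # Attach the file to a loop device (device name is hypothetical).
        ["losetup", loopdev, image],
        # Create the filesystem under test on the loop device.
        ["mkfs", "-t", fstype, loopdev],
    ]


if __name__ == "__main__":
    for cmd in loop_fs_commands("/dev/shm/mtr.img", 512, "xfs"):
        print(" ".join(cmd))
```

Since the backing file lives in /dev/shm, the filesystem code paths get exercised without consuming real disk IOPS.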



 Comments   
Comment by Faustin Lammler [ 2022-01-27 ]

vladbogo can you please check if this is related to our RAMFS usage?

Comment by Vlad Bogolin [ 2022-01-27 ]

faust while looking at the MDEVs danblack mentioned, I cannot see any buildbot-specific issues (or maybe I missed something). So, I assume the problem is not related to the buildbot usage of RAMFS.

Comment by Daniel Black [ 2022-01-27 ]

It's not that tmpfs is failing on bb, it's that there were a number of filesystem issues that buildbot didn't catch (Marko noted only msan doesn't use tmpfs, I think). And users don't use tmpfs for their databases.

tmpfs doesn't support O_DIRECT so silently ignores it. This means a bunch of constraints around write alignment aren't tested.
Also MDEV-26970: cifs does support O_DIRECT; however, there is some other EINVAL in the way writes are done. We needed a user to find this case.
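
To illustrate the kind of check a test worker could run before trusting a vardir, here is a rough probe that tries one aligned O_DIRECT write in a given directory. The function name `probe_o_direct` and the 4096-byte block size are assumptions for illustration, and this is not how the server itself detects O_DIRECT support. The result depends on the filesystem and kernel version, so no expected output is given.

```python
import mmap
import os
import tempfile


def probe_o_direct(directory, block_size=4096):
    """Return True if an aligned O_DIRECT write succeeds in `directory`."""
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        return False  # platform does not define O_DIRECT at all
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        try:
            fd = os.open(path, os.O_WRONLY | o_direct)
        except OSError:
            return False  # typically EINVAL when the filesystem refuses it
        buf = mmap.mmap(-1, block_size)  # anonymous mapping is page aligned
        try:
            os.write(fd, buf)  # aligned address, aligned length, offset 0
            return True
        except OSError:
            return False  # e.g. EINVAL from an alignment constraint
        finally:
            buf.close()
            os.close(fd)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(probe_o_direct(tempfile.gettempdir()))
```

A probe like this only covers a single aligned write; it would not have caught the cifs case, which is why running the actual test suite on each filesystem still matters.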

Also WSL (maybe pulling a generated container image and doing a very small hammerdb run).
https://github.com/MariaDB/mariadb-docker/issues/403
https://github.com/MariaDB/mariadb-docker/issues/331 / https://jira.mariadb.org/browse/MDEV-24189

Comment by Daniel Black [ 2022-02-21 ]

Also would have enabled early identification of:

MDEV-27772 Performance regression with default configuration in 10.6
MDEV-27900 InnoDB: Database page corruption on btrfs with io_uring
MDEV-27593 InnoDB handle AIO errors with message rather than assertion

Comment by Daniel Black [ 2022-07-20 ]

As filesystems are kernel implemented, we need to do the filesystem test on the native kernel for the OS, as this is a common use case.

An OS build container running InnoDB on tmpfs will test many of the internals of InnoDB, but its interaction with the filesystem isn't tested in the way a user would use it. InnoDB is also innovative in using system calls like io_uring and O_DIRECT, and these need to be tested on all filesystems and OSes.

The main case for containers that run on a different OS is rather limited to the Ubuntu focal MariaDB 10.2-10.7 and jammy 10.8+ containers that Docker Library generates, which should be able to run on all kernel versions.

Fedora and openSUSE have made a large commitment to btrfs. Ext4, last I checked, was still a common/default option presented in the OS install for other OSes.

Passing a block device into the VM for testing and running mkfs on it won't take long at all on small partitions. Even randomizing the FS used in the test would provide the opportunity to catch the errors. MTR has a --vardir option (e.g. --vardir=/mnt), so the InnoDB tests can be run there.
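
A rough sketch of how the randomization could look, again as a dry run that only builds the commands. The device path `/dev/vdb`, the filesystem list and the helper name `plan_test_run` are hypothetical; seeding the choice from something like the build number is an assumption made so that a failing run can be reproduced on the same filesystem. zfs is left out because it is not created through mkfs.

```python
import random

# Filesystems that can be created with "mkfs -t <type>" (illustrative list).
FILESYSTEMS = ("ext4", "xfs", "btrfs")


def plan_test_run(seed, device="/dev/vdb", vardir="/mnt"):
    """Pick a filesystem deterministically from `seed` and build the commands."""
    fstype = random.Random(seed).choice(FILESYSTEMS)
    return {
        "fstype": fstype,
        "mkfs": ["mkfs", "-t", fstype, device],
        "mount": ["mount", device, vardir],
        # Run the InnoDB suite with its vardir on the freshly made filesystem.
        "mtr": ["./mysql-test-run.pl", f"--vardir={vardir}", "--suite=innodb"],
    }


if __name__ == "__main__":
    plan = plan_test_run(seed=12345)
    for step in ("mkfs", "mount", "mtr"):
        print(" ".join(plan[step]))
```

Logging the chosen filesystem alongside the test result would make it possible to correlate failures with a particular FS over time.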

Aria: not too concerned about it, as it's really boring in the way it accesses the filesystem.

We also need to identify a bleeding-edge kernel, as finding out about corruption once a kernel hits mainstream isn't good for users. I do this on fedora36 easily enough.

Comment by Daniel Black [ 2022-09-23 ]

f2fs - MDEV-29617

Generated at Thu Feb 08 03:37:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.