[MDEV-28859] MariaDB Assert Crash Using mysqlbackup Created: 2022-06-15 Updated: 2023-10-23 Resolved: 2023-10-23 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.8.3 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Lee Thompson | Assignee: | Daniel Black |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Running in Docker |
||
| Issue Links: |
|
||||||||||||
| Description |
|
My main MariaDB container has started crashing, sometimes, during nightly backups. It does not always crash, which makes me think it's not a data issue. When it does crash, it is not at the same point. The backup script (bash) has not changed in 2 years, and this started only recently; I'm not sure why. After this crash occurs, the server responds with "Too many connections". The part of the script running the backup issues this command:
MariaDB Log (minus the bug reporting advice)
|
| Comments |
| Comment by Marko Mäkelä [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
The crash is at the start of the following function:
Can you please try to find out the value of cb->m_err? danblack should be able to assist you with enabling and analyzing core dumps in a Docker environment. Which Linux kernel version are you using? It could play a role here. | ||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
A work-around would be to set innodb_use_native_aio=0 in the configuration. If the name "jammy" refers to Ubuntu 22.04, I think that the native AIO implementation should be liburing ( | ||||||||||||||||||||||||||||||
| Comment by Lee Thompson [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
I have no idea how to find out the cb->m_err value. I'll need step by step instructions. The container is running (uname -a)
The host OS is (uname -a)
| ||||||||||||||||||||||||||||||
| Comment by Lee Thompson [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
Since upgrading to 10.8.x it already is falling back to innodb_use_native_aio=0
| ||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
leethompson, I can’t give step-by-step instructions for Docker, and I suspect that https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ would not work out of the box. Usually (but not always) in core dumps, the crashing thread is Thread 1. Once you have identified the crashing thread in the output of thread apply all backtrace, you would have to use a command like thread 1 to switch to that thread, and then something like frame 4 (I am not sure about the number) to get to the function io_callback, and then print cb->m_err to display the value. You would likely need a separate debug symbol package installed for the last step to work.
Are there any messages about file system corruption or other trouble in the system logs or in the kernel message buffer (sudo dmesg)? Does smartctl report any storage errors? Which file system and type of storage are you using?
I was under the wrong impression that io_callback() would not be invoked by the "simulated AIO" implementation. So, there is no work-around for this at the moment. Perhaps wlad has some ideas about this, since the code was last refactored by him in | ||||||||||||||||||||||||||||||
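The gdb steps described in this comment could be collected into a command file and run non-interactively against a core dump. This is only a sketch under assumptions: the file name io_callback.gdb and the function analyze_core are made up for illustration, the binary and core paths are placeholders, and frame 4 is the guess from the comment, so the frame number will likely need adjusting after reading the backtrace.

```shell
# Write the gdb commands from the comment into a command file.
# "frame 4" is a guess; use the frame that shows io_callback in the backtrace.
cat > io_callback.gdb <<'EOF'
thread apply all backtrace
thread 1
frame 4
print cb->m_err
EOF

# Run the command file against a core dump and exit.
# $1: path to the server binary (placeholder), $2: path to the core file (placeholder).
analyze_core() {
    gdb --batch -x io_callback.gdb "$1" "$2"
}
```

A call might look like analyze_core /usr/sbin/mariadbd /path/to/core, where both paths are assumptions about the container layout. Without the debug symbol package installed, the print step is expected to fail.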
| Comment by Marko Mäkelä [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
Side note: Native AIO should be much more efficient than the fallback implementation. The liburing interface is rather recent; libaio was introduced some time during Linux 2.6 already. Because a given MariaDB Server executable will not support both implementations, it should be better to use an executable that was built for libaio. But, this should not solve the problem at hand. | ||||||||||||||||||||||||||||||
| Comment by Lee Thompson [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
@Marko Mäkelä, most of that went over my head. Fortunately, the container seems to have apt-get, so I may be able to get that working. I hope. It's getting late here so I'll try it tomorrow. The filesystem (on the host) is btrfs but it's complicated: it's a hybrid RAID 6 array (Synology Hybrid Raid 2) (the box is a Synology DS1817+). There are no errors. MariaDB's data is on the host file system through volume mounting, so it is not in a docker volume. Moreover, I've been trying to alter my backup script to use mariabackup and it works fine, so I'm pretty sure (99%?) that this isn't a file system issue. | ||||||||||||||||||||||||||||||
| Comment by Lee Thompson [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
@Marko Mäkelä, changing kernel or mariadb binary is not likely. I'm just using mariadb:latest and whatever it's got. Synology's newer model and operating system is running 4.4.180+ which wouldn't help either. (*nix is not my forte but you've probably guessed that by now.) | ||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
You are right that a < 5.1 kernel won't have uring, so it has reverted to simulated AIO. Instead of mariadb:latest, can you run the container quay.io/mariadb-foundation/mariadb-debug:10.8 (same interface), with --cap-add CAP_SYS_PTRACE in the docker options (might need CAP_ removed) when starting the container? Before doing the backup, run the following and leave it running:
"c" continues the execution. Run the backup, and gdb should be stalled at the location where the assertion happened. Then run (gdb) thread apply all bt -frame-arguments all full, capture the output, and include it here. Move through the stack frames until you are in the io_callback function. Printing the cb structure there will show its contents, including the m_err value that marko would like to see, along with the function that was in progress. | ||||||||||||||||||||||||||||||
| Comment by Lee Thompson [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
I ended up staying up for something else and took a stab at getting debug symbols into the container. The mariadb:latest image is stripped, so I followed the instructions, but it ended in failure. Suggestion for MariaDB: Make debug images. If I could swap in a mariadb:debug_latest image, it would make this a lot easier for all of us, especially those of us on systems where building a custom container is not much of an option.
This failed in the container for two reasons. sudo is not there. add-apt-repository is not there.
| ||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
leethompson, MariaDB supplies packages for many operating systems. I am not familiar with containers, so I do not know if this is relevant or applicable, but: If there is a Docker container of MariaDB based on Ubuntu 20.04 instead of 22.04, that one should use libaio instead of liburing. Since btrfs was mentioned, this might be related to | ||||||||||||||||||||||||||||||
| Comment by Lee Thompson [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
@Daniel Black, I will do that. I don't want my stuff to be down for a long time, so I'll set up the debug container with a clone of the data from the main one and work on getting you the information you need. | ||||||||||||||||||||||||||||||
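The plan above, running the debug image against a copy of the data with the ptrace capability Daniel Black mentioned, might be sketched as follows. Everything beyond the image name and the capability is an assumption: the container name mariadb-debug and the function names are arbitrary, the host path is a placeholder, /var/lib/mysql is taken to be the data directory because the image is described as having the same interface as mariadb:latest, and attaching to PID 1 presumes that gdb is present in the image and that the server is the container's first process.

```shell
# DOCKER can be overridden (for example with podman); it defaults to docker.
DOCKER=${DOCKER:-docker}

# Start the debug image with ptrace allowed so that gdb can attach.
# $1: host directory holding a copy of the data directory (placeholder).
start_debug_container() {
    $DOCKER run -d --name mariadb-debug \
        --cap-add SYS_PTRACE \
        -v "$1":/var/lib/mysql \
        quay.io/mariadb-foundation/mariadb-debug:10.8
}

# Attach gdb to the server process inside the running container.
# Assumption: gdb is installed in the image and mariadbd runs as PID 1.
attach_gdb() {
    $DOCKER exec -it mariadb-debug gdb -p 1
}
```

For example, start_debug_container /path/to/cloned-data followed by attach_gdb would leave a (gdb) prompt open; the server stays paused until execution is continued from that prompt. If the runtime rejects SYS_PTRACE, the CAP_SYS_PTRACE spelling from the earlier comment is the other form to try.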
| Comment by Daniel Black [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
The server debug symbol package does exist - https://archive.mariadb.org/mariadb-10.8.3/repo/ubuntu/pool/main/m/mariadb-10.8/mariadb-server-core-10.8-dbgsym_10.8.3%2Bmaria~jammy_amd64.ddeb , it's just missing from the repo information somehow. Downloading and installing with dpkg -i should work. marko, containers are only the userspace and not the kernel, hence mariadbd: io_uring_queue_init() failed with ENOSYS: the host kernel does not provide the io_uring interface. If you want to build your own focal (20.04) based container for 10.8.3 - https://github.com/grooverdan/mariadb/tree/focal_images/10.8-focal. | ||||||||||||||||||||||||||||||
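Downloading and installing the package mentioned above could look roughly like this. It is a sketch, assuming curl is available in the container (it may need to be installed first) and that the commands run as root, since sudo was reported missing; the variable and function names are illustrative.

```shell
# URL taken from the comment above.
DDEB_URL='https://archive.mariadb.org/mariadb-10.8.3/repo/ubuntu/pool/main/m/mariadb-10.8/mariadb-server-core-10.8-dbgsym_10.8.3%2Bmaria~jammy_amd64.ddeb'

# curl -O saves under the file name part of the URL, so derive the same name here.
DDEB_FILE=${DDEB_URL##*/}

# Download the package and install it; run as root inside the container.
install_dbgsym() {
    curl -fsSLO "$DDEB_URL" && dpkg -i "$DDEB_FILE"
}
```

Calling install_dbgsym inside the container fetches the .ddeb into the current directory and installs it with dpkg -i, as the comment suggests.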
| Comment by Daniel Black [ 2022-06-16 ] | ||||||||||||||||||||||||||||||
|
Regarding add-apt-repository: if you use 10.8 instead of 10.5 and jammy instead of focal, this should work as a repository directly. | ||||||||||||||||||||||||||||||
| Comment by Lee Thompson [ 2022-06-21 ] | ||||||||||||||||||||||||||||||
|
@Daniel Black, I'm having trouble building the container; the Synology Diskstation DS1817+ has its own Docker UI, which is somewhat limited. I've now got Portainer working and should be able to use that. I have never built a container image myself, so this may take some time. | ||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2023-09-19 ] | ||||||||||||||||||||||||||||||
|
with Is this still an issue? |