  1. MariaDB Server
  2. MDEV-28859

MariaDB Assert Crash Using mysqlbackup

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 10.8.3
    • N/A
    • None
    • Environment: Running in Docker

    Description

      My main MariaDB container has started crashing, sometimes, during nightly backups. It does not always crash which makes me think it's not a data issue. When it does crash, it is not at the same point.

      The backup script (bash) has not changed in 2 years and this just started recently, I'm not sure why. After this crash occurs the server responds with "Too many connections".

      The part of the script running the backup issues this command:

      sudo docker exec $CONTAINER_NAME mysqldump --user=$MARIADB_USER --lock-tables --all-databases > $BACKUP_TARGET_HOST
      

      MariaDB Log (minus the bug reporting advice)

      2022-06-15 13:49:12 0x7f956a7fc640  InnoDB: Assertion failure in file ./storage/innobase/os/os0file.cc line 3540
      InnoDB: Failing assertion: cb->m_err == DB_SUCCESS
      220615 13:49:12 [ERROR] mysqld got signal 6 
      Server version: 10.8.3-MariaDB-1:10.8.3+maria~jammy
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=3
      max_threads=153
      thread_count=3
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467997 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x49000
      mariadbd(my_print_stacktrace+0x32)[0x7f95a4905212]
      mariadbd(handle_fatal_signal+0x478)[0x7f95a43da7e8]
      

      Attachments

        Issue Links

          Activity

            marko Marko Mäkelä added a comment -

            The crash is at the start of the following function:

            static void io_callback(tpool::aiocb *cb)
            {
              ut_a(cb->m_err == DB_SUCCESS);
            

            Can you please try to find out the value of cb->m_err? danblack should be able to assist you with enabling and analyzing core dumps in a Docker environment.

            Which Linux kernel version are you using? It could play a role here.

            marko Marko Mäkelä added a comment -

            A work-around would be to set innodb_use_native_aio=0 in the configuration.

            If the name "jammy" refers to Ubuntu 22.04, I think that the native AIO implementation should be liburing (MDEV-24883) and not libaio.
            leethompson Lee Thompson added a comment -

            I have no idea how to find out the cb->m_err value. I'll need step by step instructions.

            The container is running (uname -a)

            Linux 3.10.105 #25426 SMP Wed Jul 8 03:19:33 CST 2020 x86_64 x86_64 x86_64 GNU/Linux
            

            The host OS is (uname -a)

            Linux 3.10.105 #25426 SMP Wed Jul 8 03:19:33 CST 2020 x86_64 GNU/Linux synology_avoton_1817+
            

            leethompson Lee Thompson added a comment -

            Since upgrading to 10.8.x, it has already been falling back to innodb_use_native_aio=0:

            2022-06-15 22:01:07 0 [Warning] mariadbd: io_uring_queue_init() failed with ENOSYS: check seccomp filters, and the kernel version (newer than 5.1 required)
            2022-06-15 22:01:07 0 [Warning] InnoDB: liburing disabled: falling back to innodb_use_native_aio=OFF
            


            marko Marko Mäkelä added a comment -

            leethompson, I can’t give step-by-step instructions for Docker, and I suspect that https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ would not work out of the box. Usually (but not always) in core dumps, the crashing thread is Thread 1. Once you have identified the crashing thread in the output of thread apply all backtrace, you would have to use a command like thread 1 to switch to that thread, and then something like frame 4 (I am not sure about the number) to get to the function io_callback, and then print cb->m_err to display the value. You would likely need a separate debug symbol package installed for the last step to work.

            Are there any messages about file system corruption or other trouble in the system logs or in the kernel message buffer (sudo dmesg)? Does smartctl report any storage errors? Which file system and type of storage are you using?

            I was under the wrong impression that io_callback() would not be invoked by the "simulated AIO" implementation. So, there is no work-around for this at the moment. Perhaps wlad has some ideas about this, since the code was last refactored by him in MDEV-16264.
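For readers following along, the core dump steps described in this comment could be run non-interactively, roughly as sketched below. This is only an illustration: the function name and the binary and core file paths are hypothetical, and the default frame number 4 is the same guess made in the comment, so it may need adjusting after looking at the backtrace.

```shell
# Sketch: print cb->m_err from a core dump using gdb in batch mode.
# Assumptions (not from the thread): the helper name, the paths passed in,
# and that io_callback is frame 4 of thread 1 (the comment says this is a guess).
print_m_err_from_core() {
  binary=$1        # path to the mariadbd executable (assumed)
  core=$2          # path to the core file (assumed)
  frame=${3:-4}    # frame number of io_callback; adjust after reading the backtrace
  gdb --batch \
    -ex 'thread apply all backtrace' \
    -ex 'thread 1' \
    -ex "frame $frame" \
    -ex 'print cb->m_err' \
    "$binary" "$core"
}

# Example call (paths are placeholders):
#   print_m_err_from_core /usr/sbin/mariadbd /tmp/core
```

Without the debug symbol package installed, the final print step is unlikely to work, as noted in the comment.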

            marko Marko Mäkelä added a comment -

            Side note: Native AIO should be much more efficient than the fallback implementation. The liburing interface is rather recent; libaio was introduced some time during Linux 2.6 already. Because a given MariaDB Server executable will not support both implementations, it should be better to use an executable that was built for libaio. But, this should not solve the problem at hand.
            leethompson Lee Thompson added a comment -

            @Marko Mäkelä, most of that went over my head. Fortunately, the container seems to have apt-get, so I may be able to get that working. I hope. It's getting late here so I'll try it tomorrow.

            The filesystem (on the host) is btrfs, but it's complicated: it's a hybrid RAID 6 array (Synology Hybrid RAID 2) on a Synology DS1817+. There are no errors. MariaDB's data is on the host file system through a volume mount, so it is not in a Docker volume.

            Moreover, I've been trying to alter my backup script to use mariabackup and it works fine so I'm pretty sure (99%?) that this isn't a file system issue.

            leethompson Lee Thompson added a comment - - edited

            @Marko Mäkelä, changing the kernel or the MariaDB binary is not likely. I'm just using mariadb:latest and whatever it's got. Synology's newer model and operating system run 4.4.180+, which wouldn't help either.

            (*nix is not my forte but you've probably guessed that by now.)

            danblack Daniel Black added a comment - - edited

            You are right that a kernel older than 5.1 won't have io_uring, so the server has reverted to simulated AIO.

            Instead of mariadb:latest, can you run the container quay.io/mariadb-foundation/mariadb-debug:10.8? It has the same interface. Add --cap-add CAP_SYS_PTRACE to the docker options (you might need to remove the CAP_ prefix) when starting the container.

            Before doing the backup, run the following and leave it running:

            sudo docker exec -ti $CONTAINER_NAME gdb -p 1
            (gdb) c
            

            "c" continues the execution.

            Run the backup; gdb should stop at the point where the assertion fails.

            Then run:
            (gdb) thread apply all bt -frame-arguments all full

            Capture this output and include it here.

            Next, go:
            (gdb) up

            until you are in the io_callback function. Then:
            (gdb) p *cb

            will show the contents of cb, including the m_err value that marko would like to see, along with the function that was in progress.
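The setup described in this comment might look roughly like the sketch below. It is a hedged illustration rather than a tested recipe: the helper names, the container name, and the host data directory are hypothetical, and mounting the data at /var/lib/mysql assumes the debug image uses the same data directory as the regular image. The capability is written as SYS_PTRACE here; some Docker versions may also accept CAP_SYS_PTRACE. On hosts where Docker requires root, prefix the calls with sudo as in the original backup command.

```shell
# Sketch of the debug container setup described in the comment above.
# Assumptions (hypothetical, not from the thread): helper names, the
# container name, the host data directory, and the /var/lib/mysql mount.
start_debug_container() {
  name=$1       # container name, e.g. mariadb-debug (assumed)
  datadir=$2    # host directory holding a copy of the data (assumed)
  docker run -d --name "$name" \
    --cap-add SYS_PTRACE \
    -v "$datadir:/var/lib/mysql" \
    quay.io/mariadb-foundation/mariadb-debug:10.8
}

# Attach gdb to PID 1 inside the container. At the (gdb) prompt, type "c"
# to let the server continue, then run the backup from another terminal.
attach_gdb() {
  docker exec -ti "$1" gdb -p 1
}

# Example calls (names and paths are placeholders):
#   start_debug_container mariadb-debug /volume1/docker/mariadb-clone
#   attach_gdb mariadb-debug
```

Running this against a clone of the data, rather than the live data directory, keeps the production container untouched while the crash is reproduced.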
            leethompson Lee Thompson added a comment -

            Ended up staying up for something else and took a stab at trying to get debug symbols in the container.

            The mariadb:latest is stripped so I followed the instructions but it ended in failure.

            Suggestion for MariaDB: Make debug images. If I could swap out with a mariadb:debug_latest image, it would make this a lot easier for all of us. Especially those of us on systems where building a custom container is not much of an option.

            sudo add-apt-repository 'deb [arch=amd64,arm64,ppc64el,s390x]  https://ftp.osuosl.org/pub/mariadb/repo/10.5/ubuntu focal main/debug'
            

            This failed in the container for two reasons. sudo is not there. add-apt-repository is not there.

            root@MariaDB:/# apt-get update && apt-get install -y mariadb-server-core-10.8.3-dbgsym 
            Get:1 http://archive.ubuntu.com/ubuntu jammy InRelease [270 kB]                                     
            Get:2 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]                                    
            Get:4 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [109 kB]                                      
            Get:5 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [99.8 kB]               
            Get:3 https://archive.mariadb.org/mariadb-10.8.3/repo/ubuntu jammy InRelease [7728 B]
            Get:6 http://archive.ubuntu.com/ubuntu jammy/main amd64 Packages [1792 kB]
            Get:7 http://security.ubuntu.com/ubuntu jammy-security/multiverse amd64 Packages [4648 B]
            Get:8 http://security.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages [212 kB]
            Get:9 http://archive.ubuntu.com/ubuntu jammy/restricted amd64 Packages [164 kB] 
            Get:10 http://archive.ubuntu.com/ubuntu jammy/multiverse amd64 Packages [266 kB]                              
            Get:11 http://archive.ubuntu.com/ubuntu jammy/universe amd64 Packages [17.5 MB]                                
            Get:12 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages [229 kB]                            
            Get:13 https://archive.mariadb.org/mariadb-10.8.3/repo/ubuntu jammy/main amd64 Packages [9823 B]                                                    
            Get:14 http://security.ubuntu.com/ubuntu jammy-security/universe amd64 Packages [89.8 kB]                                                           
            Get:15 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages [153 kB]                                                              
            Get:16 http://archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages [236 kB]                                                            
            Get:17 http://archive.ubuntu.com/ubuntu jammy-updates/multiverse amd64 Packages [4648 B]                                                            
            Get:18 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages [380 kB]                                                                  
            Get:19 http://archive.ubuntu.com/ubuntu jammy-backports/universe amd64 Packages [2016 B]                                                            
            Fetched 21.6 MB in 13s (1640 kB/s)                                                                                                                  
            Reading package lists... Done
            Reading package lists... Done
            Building dependency tree... Done
            Reading state information... Done
            E: Unable to locate package mariadb-server-core-10.8.3-dbgsym
            E: Couldn't find any package by glob 'mariadb-server-core-10.8.3-dbgsym'
            E: Couldn't find any package by regex 'mariadb-server-core-10.8.3-dbgsym'
            
            


            marko Marko Mäkelä added a comment -

            leethompson, MariaDB supplies packages for many operating systems. I am not familiar with containers, so I do not know if this is relevant or applicable, but: If there is a Docker container of MariaDB based on Ubuntu 20.04 instead of 22.04, that one should use libaio instead of liburing.

            Since btrfs was mentioned, this might be related to MDEV-24854 and you could try innodb_flush_method=fsync to disable the use of O_DIRECT. But I would still like to know the error code.
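One possible way to try the innodb_flush_method=fsync suggestion in a container is to drop a small override file into a directory that is mounted as the image's configuration directory. The sketch below only writes the file; the helper name, the file name, and the idea of mounting the directory at /etc/mysql/conf.d are assumptions for illustration, not something stated in the thread.

```shell
# Sketch: write a config override that sets innodb_flush_method=fsync.
# Assumptions (not from the thread): the helper name, the file name, and that
# the directory is later mounted into the container at /etc/mysql/conf.d.
write_fsync_override() {
  conf_dir=$1    # host directory for extra config files (assumed)
  mkdir -p "$conf_dir" || return 1
  cat > "$conf_dir/flush-method.cnf" <<'EOF'
[mysqld]
innodb_flush_method=fsync
EOF
}

# Example call (path is a placeholder):
#   write_fsync_override /volume1/docker/mariadb-conf
# then start the container with: -v /volume1/docker/mariadb-conf:/etc/mysql/conf.d
```

After a restart, SHOW VARIABLES LIKE 'innodb_flush_method' should confirm whether the setting was picked up.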
            leethompson Lee Thompson added a comment -

            @Daniel Black, I will do that. I don't want my stuff to be down for a long time, so what I'll do is set up the debug container with a clone of the data from the main one and work on getting you the information you need.

            danblack Daniel Black added a comment -

            The server debug symbol package does exist: https://archive.mariadb.org/mariadb-10.8.3/repo/ubuntu/pool/main/m/mariadb-10.8/mariadb-server-core-10.8-dbgsym_10.8.3%2Bmaria~jammy_amd64.ddeb . It's just missing from the repo information somehow. Downloading it and installing it with dpkg -i should work.

            marko, containers provide only the userspace and not the kernel; hence "mariadbd: io_uring_queue_init() failed with ENOSYS", because the host kernel lacks the interface.

            If you want to build your own focal (20.04) based container for 10.8.3, see https://github.com/grooverdan/mariadb/tree/focal_images/10.8-focal.
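The download-and-install step mentioned in this comment could be sketched as follows. The URL is the one given in the comment; everything else is an assumption for illustration: the helper name, the local file name, running as root inside the container, and curl being available there (it may need to be installed with apt-get first).

```shell
# Sketch: fetch the debug symbol package named in the comment and install it.
# Assumptions (not from the thread): the helper name, the local file name,
# running as root inside the container, and curl being installed.
DDEB_URL='https://archive.mariadb.org/mariadb-10.8.3/repo/ubuntu/pool/main/m/mariadb-10.8/mariadb-server-core-10.8-dbgsym_10.8.3%2Bmaria~jammy_amd64.ddeb'

install_dbgsym() {
  dest=${1:-/tmp/mariadb-server-core-dbgsym.ddeb}   # hypothetical local file name
  # -f: fail on HTTP errors; -L: follow redirects; -o: write to the given file
  curl -fL -o "$dest" "$DDEB_URL" && dpkg -i "$dest"
}

# Example call:
#   install_dbgsym
```

Because dpkg only runs if the download succeeds, a failed fetch does not leave a partially installed package.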
            danblack Daniel Black added a comment -

            Regarding add-apt-repository: if you use 10.8 instead of 10.5 and jammy instead of focal, this should work as a repository directly.

            leethompson Lee Thompson added a comment -

            @Daniel Black, I'm having trouble building the container; the Synology DiskStation DS1817+ has its own Docker UI, which is somewhat limited. I've now got Portainer working and should be able to use that.

            I have never built my own container image myself so this may take some time.

            danblack Daniel Black added a comment -

            With MDEV-27593 resolved, the code path around this assertion and all stacks above have significantly improved.

            Is this still an issue?

            ashishcyrus ashish added a comment -

            yes, this issue is happening every month


            marko Marko Mäkelä added a comment -

            ashishcyrus, which version are you using? Note that the 10.8 series was a short-term support release and is no longer supported. Can you post some fresh logs?

            People

              danblack Daniel Black
              leethompson Lee Thompson
              Votes: 0
              Watchers: 5
