  1. MariaDB Server
  2. MDEV-35666

mariadb-galera all database files are missing

Details

    Description

      My environment is a set of virtual machines with poor performance, and mariadb-galera is deployed on three nodes as containers. After all the containers crashed, one randomly selected node was restarted, and occasionally all the files in the mysql directory on all three nodes were gone. I suspect SST or something else. Have you encountered similar issues in other versions?

      Attachments

        1. image-2024-12-18-16-04-27-042.png (251 kB)
        2. image-2024-12-18-16-04-34-782.png (253 kB)
        3. image-2024-12-18-16-04-39-889.png (270 kB)
        4. mariadb-galera-0.log (3 kB)
        5. mariadb-galera-1.log (4 kB)
        6. mariadb-galera-2.log (3 kB)
        7. mysqld0.log (4.18 MB)
        8. mysqld1.log (3.76 MB)
        9. mysqld2.log (1.66 MB)
        10. screenshot-1.png (11 kB)
        11. screenshot-2.png (124 kB)

        Activity

          danblack Daniel Black added a comment -

          Some container logs would provide some searchable material of what might have gone wrong. Can you attach these as text files (not images)?

          Take a look for similar issues in the release notes from 10.6.8 onwards.

          https://mariadb.com/kb/en/release-notes-mariadb-106-series/

          Also note that we are no longer providing CentOS 7 releases - https://mariadb.com/kb/en/mariadb-platform-deprecation-policy/.

          luorui luorui added a comment -

          mariadb-galera-0.log mariadb-galera-1.log mariadb-galera-2.log mysqld0.log mysqld1.log mysqld2.log

          After restarting the three-node system, the data on the database nodes was gone. The files above are the corresponding logs.

          luorui luorui added a comment -

          We use Helm to deploy bitnami/mariadb-galera to a k8s cluster. The k8s version is 1.23.9 and containerd is 1.6.26.

          danblack Daniel Black added a comment - - edited

          FYI https://web.archive.org/web/20230129122911/https://blog.jozefrebjak.com/how-to-run-mariadb-galera-cluster-with-docker

          I don't know the current status there.

          There is https://github.com/mariadb-operator/mariadb-operator that manages galera too that is actively maintained.

          /bin/pt-galera-log-explainer list --all --since '2024-12-05T10:30:10Z' ~/Downloads/mysqld[012].log

          identifier            mariadb-galera-0                           mariadb-galera-1                                mariadb-galera-2                           
          current path          /home/dan/Downloads/mysqld0.log            /home/dan/Downloads/mysqld1.log                 /home/dan/Downloads/mysqld2.log            
          last known ip         192.168.241.113                            192.168.235.34                                  192.168.59.252
          last known name       mariadb-galera-0                           mariadb-galera-1                                mariadb-galera-2                           
                                                                                                                                                                      
          2024-12-05 10:30:10   mariadb-galera-2 suspected to be down      mariadb-galera-2 suspected to be down           |                                          
          2024-12-05 10:30:11   mariadb-galera-1 joined                    mariadb-galera-0 joined                         |                                          
          2024-12-05 10:30:11   |                                          mariadb-galera-2 left                           |                                          
          2024-12-05 10:30:11   PRIMARY(n=2)                               |                                               |                                          
          2024-12-05 10:30:11   mariadb-galera-2 left                      PRIMARY(n=2)                                    |                                          
          2024-12-05 10:30:31   |                                          |                                               inactive check more than 1.5s (25.7908s)   
                                |                                          |                                               inactive check more than 1.5s (1.51535s)   
          2024-12-05 10:30:33   |                                          |                                               mariadb-galera-0 suspected to be down      
          2024-12-05 10:30:34   |                                          |                                               NON-PRIMARY(n=1)                           
          2024-12-05 10:30:34   |                                          |                                               SYNCED -> OPEN                             
          2024-12-05 10:30:34   |                                          |                                               NON-PRIMARY(n=1)                           
                                mariadb-galera-1 joined                    mariadb-galera-0 joined                         |                                          
          2024-12-05 10:30:36   mariadb-galera-2 joined                    mariadb-galera-2 joined                         |                                          
                                |                                          PRIMARY(n=3)                                    |                                          
          2024-12-05 10:30:36   |                                          |                                               mariadb-galera-0 joined                    
          2024-12-05 10:30:36   |                                          |                                               mariadb-galera-1 joined                    
          2024-12-05 10:30:36   |                                          |                                               PRIMARY(n=3)                               
          2024-12-05 10:30:37   |                                          |                                               OPEN -> PRIMARY                            
          2024-12-05 10:30:37   PRIMARY(n=3)                               |                                               |                                          
                                |                                          |                                               will receive IST(seqno:2082)               
                                |                                          |                                               mariadb-galera-0 will resync local node    
                                |                                          |                                               PRIMARY -> JOINER                          
                                |                                          |                                               got SST from mariadb-galera-0              
          2024-12-05 10:30:38   local node will resync mariadb-galera-2    mariadb-galera-0 will resync mariadb-galera-2   |                                          
          2024-12-05 10:30:38   SYNCED -> DONOR                            mariadb-galera-0 synced mariadb-galera-2        |                                          
          2024-12-05 10:30:38   IST to mariadb-galera-2(seqno:2082)        |                                               |                                          
          2024-12-05 10:30:38   finished sending IST to mariadb-galera-2   |                                               |                                          
          2024-12-05 10:30:38   DESYNCED -> JOINED                         |                                               |                                          
          2024-12-05 10:30:38   JOINED -> SYNCED                           |                                               |                                          
          2024-12-05 10:30:47   |                                          |                                               received shutdown                          
          2024-12-05 10:30:49   mariadb-galera-1 suspected to be down      |                                               mariadb-galera-1 suspected to be down      
          2024-12-05 10:30:50   NON-PRIMARY(n=1)                           |                                               NON-PRIMARY(n=1)                           
          2024-12-05 10:30:50   SYNCED -> OPEN                             |                                               JOINER -> OPEN                             
          2024-12-05 10:30:50   |                                          |                                               OPEN -> CLOSED                             
          2024-12-05 10:30:50   mariadb-galera-2 left                      |                                               |                                          
          2024-12-05 10:30:50   NON-PRIMARY(n=1)                           |                                               |                                          
          2024-12-05 10:30:53   |                                          |                                               IST received(seqno:2082)                   
          2024-12-05 10:30:54   |                                          |                                               shutdown complete                          
          2024-12-05 10:31:14   received shutdown                          |                                               |                                          
          2024-12-05 10:31:14   OPEN -> CLOSED                             |                                               |                                          
          2024-12-05 10:31:16   shutdown complete                          |                                               |                                          
          2024-12-05 10:31:32   |                                          inactive check more than 1.5s (47.837s)         |                                          
          2024-12-05 10:31:34   |                                          NON-PRIMARY(n=1)                                |                                          
          2024-12-05 10:31:34   |                                          SYNCED -> OPEN                                  |                                          
          2024-12-05 10:31:34   |                                          NON-PRIMARY(n=1)                                |                                          
          2024-12-05 10:32:07   |                                          received shutdown                               |                                          
          2024-12-05 10:32:08   |                                          OPEN -> CLOSED                                  |                                          
          2024-12-05 10:32:11   |                                          shutdown complete                               |                                          
          

          So galera-2 timed out:

          2024-12-05 10:30:30 0 [Note] WSREP: (9e537da6-895d, 'tcp://0.0.0.0:4567') connection to peer 8e07881d-841b with addr tcp://192.168.235.34:4567 timed out, no messages seen in PT3S, socket stats: rtt: 14643 rttvar: 17547 rto: 215000 lost: 0 last_data_recv: 21621 cwnd: 8 last_queued_since: 1519241031 last_delivered_since: 24133431088 send_queue_length: 7 send_queue_bytes: 588 segment: 0 messages: 7
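
          When scanning large mysqld logs for this kind of message, a small parser can help pull out which peer timed out and for how long. The following is only a minimal sketch: the regular expression is derived from the single log line quoted above, so it is an assumption that other Galera versions format the message the same way, and the names `TIMEOUT_RE` and `parse_timeout` are made up for illustration. The sample line is shortened after `last_data_recv`.

```python
import re

# Log line quoted above (socket stats shortened for brevity).
LINE = (
    "2024-12-05 10:30:30 0 [Note] WSREP: (9e537da6-895d, 'tcp://0.0.0.0:4567') "
    "connection to peer 8e07881d-841b with addr tcp://192.168.235.34:4567 "
    "timed out, no messages seen in PT3S, socket stats: rtt: 14643 rttvar: 17547 "
    "rto: 215000 lost: 0 last_data_recv: 21621"
)

# Assumed message layout, inferred only from the line above.
TIMEOUT_RE = re.compile(
    r"connection to peer (?P<peer>\S+) with addr tcp://(?P<addr>[\d.]+):(?P<port>\d+) "
    r"timed out, no messages seen in (?P<period>PT[\d.]+S)"
)


def parse_timeout(line):
    """Return peer id, address, port, period and socket stats, or None if no match."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    info["port"] = int(info["port"])
    # Socket stats appear as "key: value" pairs after "socket stats:".
    stats = line.partition("socket stats:")[2]
    info["stats"] = {k: int(v) for k, v in re.findall(r"(\w+): (\d+)", stats)}
    return info


info = parse_timeout(LINE)
print(info["addr"], info["period"], info["stats"]["last_data_recv"])
```

          The address it reports, 192.168.235.34, is the last known IP of mariadb-galera-1 in the table above.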
          

          It did an SST recovery from galera-0, all seemed successful, and then it was immediately shut down (by Bitnami? hard to say - no logs after the startup at 07:55).

          Frequent errors:

          Slave SQL: Error 'Can't create table `idatafusion`.`agent` (errno: 121 "Duplicate key on write or update")' on query. Default database: 'idatafusion'. Query: 'ALTER TABLE `agent` ADD CONSTRAINT `fk_agent_version_info` FOREIGN KEY (`version_id`) REFERENCES `version`(`id`)', Internal MariaDB error code: 1005
          

          I'm unsure if this was meant to replicate, but the next log message says it was ignored.

          I can't see anything about data disappearing, only the container being shut down.

          You have Helm configured to make /bitnami/mariadb/data a persistent volume, right?

          luorui luorui added a comment - - edited

          Yes, /bitnami/mariadb/data is a persistent volume.
          The following file for mariadb-galera-1 after the reboot is as follows

          The following file for mariadb-galera-2 after the reboot is as follows

          The following file for mariadb-galera-0 is lose

          I looked at which node was selected as the master on startup: mariadb-galera-1 was started as the master node.

          I don't know if any of this helps, thanks for the answer

          danblack Daniel Black added a comment -

          "The following file for mariadb-galera-0 is lose" is blank.

          The "not a database .sst" could be a result of the SST not finishing donating to that node. There were insufficient logs to say if this was definitely the case, and not enough to suggest what may have occurred, sorry.

          I recommend upgrading to a later 10.6 version, as there have been a considerable number of fixes, and bootstrapping from your latest Galera node.
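
          To illustrate what picking the latest node can look like: each Galera node keeps a grastate.dat file in its data directory with `seqno` and `safe_to_bootstrap` fields. The sketch below compares those fields across nodes. It is a minimal example under stated assumptions, not the Bitnami chart's actual logic: the function names are made up, and the UUID and seqno values in `SAMPLE` are hypothetical rather than taken from this cluster. After a crash `seqno` is typically -1, in which case the position has to be recovered first (for example by starting the server with `--wsrep-recover`) before a comparison like this means anything.

```python
def parse_grastate(text):
    """Parse the 'key: value' lines of a grastate.dat file into a dict."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def pick_bootstrap_node(states):
    """Prefer a node marked safe_to_bootstrap, else the one with the highest seqno."""
    for name, st in states.items():
        if st.get("safe_to_bootstrap") == "1":
            return name
    return max(states, key=lambda n: int(states[n].get("seqno", "-1")))


# Hypothetical file contents: the UUID and seqno values are illustrative only.
TEMPLATE = (
    "# GALERA saved state\n"
    "version: 2.1\n"
    "uuid:    11111111-2222-3333-4444-555555555555\n"
    "seqno:   {seqno}\n"
    "safe_to_bootstrap: 0\n"
)
SAMPLE = {
    "mariadb-galera-0": TEMPLATE.format(seqno=2082),
    "mariadb-galera-1": TEMPLATE.format(seqno=2090),
    "mariadb-galera-2": TEMPLATE.format(seqno=2075),
}

states = {name: parse_grastate(text) for name, text in SAMPLE.items()}
chosen = pick_bootstrap_node(states)
print(chosen)
```

          With these sample values no node is marked safe, so the node with the highest seqno is chosen. If a node's data directory is empty, as reported here, there is no grastate.dat to read and that node cannot be a bootstrap candidate.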

          I'm going to close this as incomplete for now, but if there's further information, especially on a later version this can be examined.

