MariaDB Server / MDEV-35024

WSREP: State transfer failed: -22 (Invalid argument)

Details

    • Bug
    • Status: Open
    • Critical
    • Resolution: Unresolved
    • 10.11.9
    • None
    • Galera
    • Ubuntu 22.04.5

    Description

      Hi MariaDB-team,

      We have a five-node MariaDB Galera Cluster.
      After taking one node out of service, clearing its data directories, and starting MariaDB, the state transfer starts, but after about 90% it fails with:

      ...
      Sep 27 09:20:58 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:20:58 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 19935177778, "done": 19935177778, "indefinite": -1 }'
      Sep 27 09:21:04 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:04 0 [Note] WSREP: REPORTING SST PROGRESS: '{ "from": 1, "to": 3, "total": 19947753124, "done": 19947753124, "indefinite": -1 }'
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Warning] WSREP: 0.0 (maria-muc-1): State transfer to 3.0 (maria-ham-3) failed: -22 (Invalid argument)
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [ERROR] WSREP: ./gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1217: Will never receive state. Need to abort.
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Note] WSREP: gcomm: terminating thread
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Note] WSREP: gcomm: joining thread
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Note] WSREP: gcomm: closing backend
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Note] WSREP: view(view_id(NON_PRIM,0abc8cb4-9ae0,609) memb {
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]:         a591b16e-9113,0
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: } joined {
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: } left {
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: } partitioned {
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]:         0abc8cb4-9ae0,0
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]:         29958514-86e6,0
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]:         7150c82e-8f59,0
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]:         dfc13b8b-8838,0
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: })
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Note] WSREP: PC protocol downgrade 1 -> 0
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Note] WSREP: view((empty))
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Note] WSREP: gcomm: closed
      Sep 27 09:21:08 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:08 0 [Note] WSREP: /usr/sbin/mariadbd: Terminated.
      Sep 27 09:21:08 maria-ham-3 systemd[1]: mariadb.service: Main process exited, code=killed, status=6/ABRT
      Sep 27 09:21:08 maria-ham-3 systemd[1]: mariadb.service: Failed with result 'signal'.
      Sep 27 09:21:08 maria-ham-3 systemd[1]: Failed to start MariaDB 10.11.9 database server.
      

      The error recurs on every attempt to restart the transfer.
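      To compare the reported progress with the point of failure, the `REPORTING SST PROGRESS` lines can be parsed directly. The following is a minimal sketch, assuming the log format is exactly as in the excerpt above; the function name `sst_progress` and the regular expression are illustrative and not part of MariaDB or Galera.

```python
import json
import re

# Matches the JSON payload of a "REPORTING SST PROGRESS" log line.
# Assumption: the payload is a single-quoted JSON object, as in the excerpt.
PROGRESS_RE = re.compile(r"REPORTING SST PROGRESS: '(\{.*\})'")


def sst_progress(line):
    """Return (done, total, percent) for a progress line, or None otherwise."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    report = json.loads(match.group(1))
    total, done = report["total"], report["done"]
    # Guard against a zero or negative total before dividing.
    percent = 100.0 * done / total if total > 0 else 0.0
    return done, total, percent


if __name__ == "__main__":
    sample = (
        "Sep 27 09:21:04 maria-ham-3 mariadbd[2442465]: 2024-09-27  9:21:04 0 "
        "[Note] WSREP: REPORTING SST PROGRESS: '{ \"from\": 1, \"to\": 3, "
        "\"total\": 19947753124, \"done\": 19947753124, \"indefinite\": -1 }'"
    )
    print(sst_progress(sample))
```

      In both progress lines quoted above, "done" equals "total", and "total" itself grows between the two reports.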

      Versions:

      # dpkg -l galera-4 mariadb-server
      Desired=Unknown/Install/Remove/Purge/Hold
      | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
      |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
      ||/ Name           Version                 Architecture Description
      +++-==============-=======================-============-====================================================
      ii  galera-4       26.4.19-ubu2204         amd64        Replication framework for transactional applications
      ii  mariadb-server 1:10.11.9+maria~ubu2204 amd64        MariaDB database server binaries
      

      I set wsrep_sst_donor on node maria-ham-3 to maria-muc-1 to make debugging easier.
      The filesystems have enough space, and I can't find any reason why the transfer stops after starting successfully.

      Can you give me any hints on where to look or what to do for further debugging?
      Greetings
      Lars

          People

            Assignee: Unassigned
            Reporter: Lars Timmann (lollypop)
            Votes: 0
            Watchers: 3
