MariaDB Server / MDEV-22668

"Flush SSL" command doesn't reload wsrep cert

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 10.4(EOL)
    • Fix Version/s: 10.4.19, 10.5.10
    • Component/s: SSL, wsrep
    • Labels: None

    Description

      "Flush SSL" is an awesome feature. It's nice to be able to renew the TLS cert without restarting servers. But it looks like it does not trigger a reload of the cert file configured in wsrep_provider_options. This will break replication when we need to restart a cluster node.
      It would be nice to be able to trigger a wsrep SSL reload.

      Note: I am running MariaDB 10.4.13

        Activity

          Review and testing.

          jplindst Jan Lindström (Inactive) added a comment

          Others will add to this, but to summarize some information:

          FLUSH SSL is not properly triggering the galera reset in our environment, and last night this caused our production DB clusters to fail.

          Investigation shows that even in 10.5.10.7, FLUSH SSL is only causing the MariaDB port 3306 to pick up the new certificate, while WSREP on port 4567 keeps the old certificate.

          Running "SET GLOBAL wsrep_provider_options = 'socket.ssl_reload=1';" as the DB root user does cause WSREP on port 4567 to pick up the new certificate.

          Looking at the test case added in commit c3b016efde4b1e0c2b85ca26c814ad43f5611ab2, I see that it only tests whether reconnection is possible after running the SET GLOBAL. It does later run a FLUSH SSL, but it then immediately goes into cleanup instead of testing whether that worked properly.

          As such, I'm pretty sure that this needs to be reopened. People trying to use this feature need to be aware that FLUSH SSL is still insufficient when using WSREP, and that a workaround is currently possible by adding the SET GLOBAL to the sequence.

          zelch Zephaniah Loss-Cutler-Hull added a comment
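
          The workaround described in the comment above (FLUSH SSL followed by the SET GLOBAL) could be scripted roughly as follows. This is only a sketch: the two SQL statements are taken from the comment, while the helper name reload_tls, the Cursor protocol, and the ordering of the statements are assumptions for illustration. It expects any DB-API 2.0 style cursor opened by a user with sufficient privileges.

```python
from typing import List, Protocol, Sequence


class Cursor(Protocol):
    """Minimal DB-API 2.0 cursor surface this sketch relies on."""

    def execute(self, statement: str) -> object: ...


# Statements quoted in the comment above. Running FLUSH SSL first is an
# assumption; the comment only says to add the SET GLOBAL to the sequence.
RELOAD_STATEMENTS: Sequence[str] = (
    "FLUSH SSL",
    "SET GLOBAL wsrep_provider_options = 'socket.ssl_reload=1'",
)


def reload_tls(cursor: Cursor) -> List[str]:
    """Run each reload statement in order and return what was executed.

    reload_tls is a hypothetical helper, not part of MariaDB or Galera.
    """
    executed: List[str] = []
    for statement in RELOAD_STATEMENTS:
        cursor.execute(statement)
        executed.append(statement)
    return executed
```

          The returned list is only there so a caller can log which statements were sent; whether port 4567 actually serves the new certificate still has to be checked separately.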

          Hi Zephaniah,

          Can you run with wsrep debugging enabled (wsrep_debug=1) and report back the trace?

          mkaruza Mario Karuza (Inactive) added a comment

          Hi, there was a problem with the Galera library. It should be fixed in the new version.

          mkaruza Mario Karuza (Inactive) added a comment

          My apologies for the delay in getting the debug output. Is that still needed?

          And do you have the fix commit for the Galera library?

          zelch Zephaniah Loss-Cutler-Hull added a comment

          People

            Assignee: jplindst Jan Lindström (Inactive)
            Reporter: rmelo Ricardo Melo
            Votes: 3
            Watchers: 11
