[MDEV-32366] WSREP closes because of high traffic load and other replication Created: 2023-10-06  Updated: 2023-11-06

Status: Open
Project: MariaDB Server
Component/s: Replication, Server, wsrep
Affects Version/s: 10.5.19
Fix Version/s: 10.5

Type: Bug Priority: Critical
Reporter: Daniel Czadek Assignee: Seppo Jaakola
Resolution: Unresolved Votes: 0
Labels: crash, galera, replication
Environment:

Debian 11 VM on VMware vSphere with Storage on NetApp.


Attachments: Text File error.log    

 Description   

This bug has now happened twice: the server stopped replicating in the Galera cluster over wsrep. We suspect (but are not sure) that the cause is high load combined with a classic MySQL replication on the second MariaDB server, which replicates with a MySQL 5.7 server for data migration.

We don't understand why the server stops replicating over wsrep but still runs normally and remains reachable, queryable, and so on.
The HAProxy in front of it can still log in, so it doesn't switch connections over to the other (backup) server.
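
A login-only check cannot see the wsrep state, which is why HAProxy keeps the node in rotation. The following is only a minimal sketch of the decision a wsrep-aware check might make, assuming some script has already fetched the node's status variables into a dictionary. The function name `galera_node_is_healthy` and the choice of criteria are illustrative assumptions; the status variable names are the ones reported by `SHOW STATUS`.

```python
def galera_node_is_healthy(status):
    """Return True only for a synced member of a Primary component.

    `status` maps wsrep status variable names to their string values,
    as reported by SHOW STATUS LIKE 'wsrep%'.
    """
    return (
        status.get("wsrep_ready") == "ON"
        and status.get("wsrep_connected") == "ON"
        and status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_local_state") == "4"  # 4 = Synced
    )


# Values taken from the problematic node in this report.
problem_node = {
    "wsrep_ready": "OFF",
    "wsrep_connected": "OFF",
    "wsrep_cluster_status": "Disconnected",
    "wsrep_local_state": "5",
}

print(galera_node_is_healthy(problem_node))  # prints False
```

A check built this way would mark the node down as soon as it leaves the cluster, even though plain logins still succeed.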

This is the problematic one:

MariaDB [(none)]> show status like "%wsrep%";
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name                 | Value                                                                                                                                          |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| wsrep_local_state_uuid        | 00000000-0000-0000-0000-000000000000                                                                                                           |
| wsrep_protocol_version        | 10                                                                                                                                             |
| wsrep_last_committed          | -1                                                                                                                                             |
| wsrep_replicated              | 3928800                                                                                                                                        |
| wsrep_replicated_bytes        | 6318098688                                                                                                                                     |
| wsrep_repl_keys               | 31628277                                                                                                                                       |
| wsrep_repl_keys_bytes         | 347319288                                                                                                                                      |
| wsrep_repl_data_bytes         | 5696592424                                                                                                                                     |
| wsrep_repl_other_bytes        | 0                                                                                                                                              |
| wsrep_received                | 495184                                                                                                                                         |
| wsrep_received_bytes          | 334286424                                                                                                                                      |
| wsrep_local_commits           | 3928751                                                                                                                                        |
| wsrep_local_cert_failures     | 0                                                                                                                                              |
| wsrep_local_replays           | 0                                                                                                                                              |
| wsrep_local_send_queue        | 0                                                                                                                                              |
| wsrep_local_send_queue_max    | 26                                                                                                                                             |
| wsrep_local_send_queue_min    | 0                                                                                                                                              |
| wsrep_local_send_queue_avg    | 0.0146288                                                                                                                                      |
| wsrep_local_recv_queue        | 0                                                                                                                                              |
| wsrep_local_recv_queue_max    | 102                                                                                                                                            |
| wsrep_local_recv_queue_min    | 0                                                                                                                                              |
| wsrep_local_recv_queue_avg    | 0.0784936                                                                                                                                      |
| wsrep_local_cached_downto     | 9752343                                                                                                                                        |
| wsrep_flow_control_paused_ns  | 9770684545                                                                                                                                     |
| wsrep_flow_control_paused     | 4.2607e-06                                                                                                                                     |
| wsrep_flow_control_sent       | 48                                                                                                                                             |
| wsrep_flow_control_recv       | 584                                                                                                                                            |
| wsrep_flow_control_active     | false                                                                                                                                          |
| wsrep_flow_control_requested  | false                                                                                                                                          |
| wsrep_cert_deps_distance      | 23.8889                                                                                                                                        |
| wsrep_apply_oooe              | 0.0216107                                                                                                                                      |
| wsrep_apply_oool              | 0.0038792                                                                                                                                      |
| wsrep_apply_window            | 1.03378                                                                                                                                        |
| wsrep_apply_waits             | 0                                                                                                                                              |
| wsrep_commit_oooe             | 0                                                                                                                                              |
| wsrep_commit_oool             | 0                                                                                                                                              |
| wsrep_commit_window           | 1.00543                                                                                                                                        |
| wsrep_local_state             | 5                                                                                                                                              |
| wsrep_local_state_comment     | Inconsistent                                                                                                                                   |
| wsrep_cert_index_size         | 21                                                                                                                                             |
| wsrep_causal_reads            | 0                                                                                                                                              |
| wsrep_cert_interval           | 152.001                                                                                                                                        |
| wsrep_open_transactions       | 0                                                                                                                                              |
| wsrep_open_connections        | 0                                                                                                                                              |
| wsrep_incoming_addresses      |                                                                                                                                                |
| wsrep_applier_thread_count    | 0                                                                                                                                              |
| wsrep_cluster_capabilities    |                                                                                                                                                |
| wsrep_cluster_conf_id         | 18446744073709551615                                                                                                                           |
| wsrep_cluster_size            | 0                                                                                                                                              |
| wsrep_cluster_state_uuid      | d33e2bef-eb15-11ed-b37c-bbeed863e828                                                                                                           |
| wsrep_cluster_status          | Disconnected                                                                                                                                   |
| wsrep_connected               | OFF                                                                                                                                            |
| wsrep_local_bf_aborts         | 0                                                                                                                                              |
| wsrep_local_index             | 18446744073709551615                                                                                                                           |
| wsrep_provider_capabilities   | :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:TRX_REPLAY:ISOLATION:PAUSE:CAUSAL_READS:INCREMENTAL_WRITESET:UNORDERED:PREORDERED:STREAMING:NBO: |
| wsrep_provider_name           | Galera                                                                                                                                         |
| wsrep_provider_vendor         | Codership Oy <info@codership.com>                                                                                                              |
| wsrep_provider_version        | 4.11(r7b59af73)                                                                                                                                |
| wsrep_ready                   | OFF                                                                                                                                            |
| wsrep_rollbacker_thread_count | 1                                                                                                                                              |
| wsrep_thread_count            | 1                                                                                                                                              |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
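
The fields in the output above that show the node has dropped out of the cluster are `wsrep_local_state_comment`, `wsrep_cluster_status`, `wsrep_connected` and `wsrep_ready`. As a rough sketch for comparing such outputs between nodes, assuming the client output is available as pasted text, the table can be turned into a dictionary. `parse_status_table` is a hypothetical helper, not part of any MariaDB tooling.

```python
def parse_status_table(text):
    """Parse the ASCII table printed by the mysql client into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the prompt, "+---+" borders and blank lines
        # Drop the outer pipes, then split name from value once.
        parts = line.strip("|").split("|", 1)
        if len(parts) != 2:
            continue
        name, value = parts[0].strip(), parts[1].strip()
        if name == "Variable_name":
            continue  # header row
        status[name] = value
    return status


sample = """
+---------------------------+--------------+
| Variable_name             | Value        |
+---------------------------+--------------+
| wsrep_local_state_comment | Inconsistent |
| wsrep_incoming_addresses  |              |
| wsrep_cluster_status      | Disconnected |
| wsrep_ready               | OFF          |
+---------------------------+--------------+
"""

status = parse_status_table(sample)
for key in ("wsrep_local_state_comment", "wsrep_cluster_status", "wsrep_ready"):
    print(key, "=", status[key])
```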

And the other one:

MariaDB [(none)]> show status like "%wsrep%";
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name                 | Value                                                                                                                                          |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| wsrep_local_state_uuid        | 00000000-0000-0000-0000-000000000000                                                                                                           |
| wsrep_protocol_version        | 10                                                                                                                                             |
| wsrep_last_committed          | -1                                                                                                                                             |
| wsrep_replicated              | 3928800                                                                                                                                        |
| wsrep_replicated_bytes        | 6318098688                                                                                                                                     |
| wsrep_repl_keys               | 31628277                                                                                                                                       |
| wsrep_repl_keys_bytes         | 347319288                                                                                                                                      |
| wsrep_repl_data_bytes         | 5696592424                                                                                                                                     |
| wsrep_repl_other_bytes        | 0                                                                                                                                              |
| wsrep_received                | 495184                                                                                                                                         |
| wsrep_received_bytes          | 334286424                                                                                                                                      |
| wsrep_local_commits           | 3928751                                                                                                                                        |
| wsrep_local_cert_failures     | 0                                                                                                                                              |
| wsrep_local_replays           | 0                                                                                                                                              |
| wsrep_local_send_queue        | 0                                                                                                                                              |
| wsrep_local_send_queue_max    | 26                                                                                                                                             |
| wsrep_local_send_queue_min    | 0                                                                                                                                              |
| wsrep_local_send_queue_avg    | 0.0146288                                                                                                                                      |
| wsrep_local_recv_queue        | 0                                                                                                                                              |
| wsrep_local_recv_queue_max    | 102                                                                                                                                            |
| wsrep_local_recv_queue_min    | 0                                                                                                                                              |
| wsrep_local_recv_queue_avg    | 0.0784936                                                                                                                                      |
| wsrep_local_cached_downto     | 9752343                                                                                                                                        |
| wsrep_flow_control_paused_ns  | 9770684545                                                                                                                                     |
| wsrep_flow_control_paused     | 4.26199e-06                                                                                                                                    |
| wsrep_flow_control_sent       | 48                                                                                                                                             |
| wsrep_flow_control_recv       | 584                                                                                                                                            |
| wsrep_flow_control_active     | false                                                                                                                                          |
| wsrep_flow_control_requested  | false                                                                                                                                          |
| wsrep_cert_deps_distance      | 23.8889                                                                                                                                        |
| wsrep_apply_oooe              | 0.0216107                                                                                                                                      |
| wsrep_apply_oool              | 0.0038792                                                                                                                                      |
| wsrep_apply_window            | 1.03378                                                                                                                                        |
| wsrep_apply_waits             | 0                                                                                                                                              |
| wsrep_commit_oooe             | 0                                                                                                                                              |
| wsrep_commit_oool             | 0                                                                                                                                              |
| wsrep_commit_window           | 1.00543                                                                                                                                        |
| wsrep_local_state             | 5                                                                                                                                              |
| wsrep_local_state_comment     | Inconsistent                                                                                                                                   |
| wsrep_cert_index_size         | 21                                                                                                                                             |
| wsrep_causal_reads            | 0                                                                                                                                              |
| wsrep_cert_interval           | 152.001                                                                                                                                        |
| wsrep_open_transactions       | 0                                                                                                                                              |
| wsrep_open_connections        | 0                                                                                                                                              |
| wsrep_incoming_addresses      |                                                                                                                                                |
| wsrep_applier_thread_count    | 0                                                                                                                                              |
| wsrep_cluster_capabilities    |                                                                                                                                                |
| wsrep_cluster_conf_id         | 18446744073709551615                                                                                                                           |
| wsrep_cluster_size            | 0                                                                                                                                              |
| wsrep_cluster_state_uuid      | d33e2bef-eb15-11ed-b37c-bbeed863e828                                                                                                           |
| wsrep_cluster_status          | Disconnected                                                                                                                                   |
| wsrep_connected               | OFF                                                                                                                                            |
| wsrep_local_bf_aborts         | 0                                                                                                                                              |
| wsrep_local_index             | 18446744073709551615                                                                                                                           |
| wsrep_provider_capabilities   | :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:TRX_REPLAY:ISOLATION:PAUSE:CAUSAL_READS:INCREMENTAL_WRITESET:UNORDERED:PREORDERED:STREAMING:NBO: |
| wsrep_provider_name           | Galera                                                                                                                                         |
| wsrep_provider_vendor         | Codership Oy <info@codership.com>                                                                                                              |
| wsrep_provider_version        | 4.11(r7b59af73)                                                                                                                                |
| wsrep_ready                   | OFF                                                                                                                                            |
| wsrep_rollbacker_thread_count | 1                                                                                                                                              |
| wsrep_thread_count            | 1                                                                                                                                              |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+



 Comments   
Comment by Daniel Czadek [ 2023-10-06 ]

Sorry, this is actually the other one:

MariaDB [(none)]> show status like '%wsrep%';
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name                 | Value                                                                                                                                          |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| wsrep_local_state_uuid        | d33e2bef-eb15-11ed-b37c-bbeed863e828                                                                                                           |
| wsrep_protocol_version        | 10                                                                                                                                             |
| wsrep_last_committed          | 10517419                                                                                                                                       |
| wsrep_replicated              | 1058536                                                                                                                                        |
| wsrep_replicated_bytes        | 761733056                                                                                                                                      |
| wsrep_repl_keys               | 6846313                                                                                                                                        |
| wsrep_repl_keys_bytes         | 80183688                                                                                                                                       |
| wsrep_repl_data_bytes         | 610293912                                                                                                                                      |
| wsrep_repl_other_bytes        | 0                                                                                                                                              |
| wsrep_received                | 3982045                                                                                                                                        |
| wsrep_received_bytes          | 6318538912                                                                                                                                     |
| wsrep_local_commits           | 1058506                                                                                                                                        |
| wsrep_local_cert_failures     | 0                                                                                                                                              |
| wsrep_local_replays           | 0                                                                                                                                              |
| wsrep_local_send_queue        | 0                                                                                                                                              |
| wsrep_local_send_queue_max    | 2                                                                                                                                              |
| wsrep_local_send_queue_min    | 0                                                                                                                                              |
| wsrep_local_send_queue_avg    | 6.31044e-06                                                                                                                                    |
| wsrep_local_recv_queue        | 0                                                                                                                                              |
| wsrep_local_recv_queue_max    | 37                                                                                                                                             |
| wsrep_local_recv_queue_min    | 0                                                                                                                                              |
| wsrep_local_recv_queue_avg    | 0.127858                                                                                                                                       |
| wsrep_local_cached_downto     | 10330416                                                                                                                                       |
| wsrep_flow_control_paused_ns  | 9756498133                                                                                                                                     |
| wsrep_flow_control_paused     | 4.25587e-06                                                                                                                                    |
| wsrep_flow_control_sent       | 536                                                                                                                                            |
| wsrep_flow_control_recv       | 584                                                                                                                                            |
| wsrep_flow_control_active     | false                                                                                                                                          |
| wsrep_flow_control_requested  | false                                                                                                                                          |
| wsrep_cert_deps_distance      | 139.337                                                                                                                                        |
| wsrep_apply_oooe              | 2.30584e-05                                                                                                                                    |
| wsrep_apply_oool              | 1.54391e-05                                                                                                                                    |
| wsrep_apply_window            | 1.00002                                                                                                                                        |
| wsrep_apply_waits             | 1                                                                                                                                              |
| wsrep_commit_oooe             | 0                                                                                                                                              |
| wsrep_commit_oool             | 0                                                                                                                                              |
| wsrep_commit_window           | 1.00002                                                                                                                                        |
| wsrep_local_state             | 4                                                                                                                                              |
| wsrep_local_state_comment     | Synced                                                                                                                                         |
| wsrep_cert_index_size         | 2597                                                                                                                                           |
| wsrep_causal_reads            | 0                                                                                                                                              |
| wsrep_cert_interval           | 0.137196                                                                                                                                       |
| wsrep_open_transactions       | 1                                                                                                                                              |
| wsrep_open_connections        | 0                                                                                                                                              |
| wsrep_incoming_addresses      | 10.49.97.153:3306                                                                                                                              |
| wsrep_cluster_weight          | 1                                                                                                                                              |
| wsrep_desync_count            | 0                                                                                                                                              |
| wsrep_evs_delayed             |                                                                                                                                                |
| wsrep_evs_evict_list          |                                                                                                                                                |
| wsrep_evs_repl_latency        | 8.21e-07/2.87402e-06/0.00214963/1.0055e-05/47122                                                                                               |
| wsrep_evs_state               | OPERATIONAL                                                                                                                                    |
| wsrep_gcomm_uuid              | 40e0f479-4f3c-11ee-a03c-2b542d001686                                                                                                           |
| wsrep_gmcast_segment          | 0                                                                                                                                              |
| wsrep_applier_thread_count    | 1                                                                                                                                              |
| wsrep_cluster_capabilities    |                                                                                                                                                |
| wsrep_cluster_conf_id         | 28                                                                                                                                             |
| wsrep_cluster_size            | 1                                                                                                                                              |
| wsrep_cluster_state_uuid      | d33e2bef-eb15-11ed-b37c-bbeed863e828                                                                                                           |
| wsrep_cluster_status          | Primary                                                                                                                                        |
| wsrep_connected               | ON                                                                                                                                             |
| wsrep_local_bf_aborts         | 0                                                                                                                                              |
| wsrep_local_index             | 0                                                                                                                                              |
| wsrep_provider_capabilities   | :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:TRX_REPLAY:ISOLATION:PAUSE:CAUSAL_READS:INCREMENTAL_WRITESET:UNORDERED:PREORDERED:STREAMING:NBO: |
| wsrep_provider_name           | Galera                                                                                                                                         |
| wsrep_provider_vendor         | Codership Oy <info@codership.com>                                                                                                              |
| wsrep_provider_version        | 4.11(r7b59af73)                                                                                                                                |
| wsrep_ready                   | ON                                                                                                                                             |
| wsrep_rollbacker_thread_count | 1                                                                                                                                              |
| wsrep_thread_count            | 2                                                                                                                                              |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
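
Since HAProxy only verifies that it can log in, a check based on the wsrep status variables shown above might catch a node that has stopped replicating. The following is only a minimal sketch: the function name `wsrep_problems`, the `expected_cluster_size` parameter, and the choice of which variables to test are assumptions for illustration and not part of our setup. It expects the output of `SHOW GLOBAL STATUS LIKE 'wsrep%'` already loaded into a dict.

```python
from typing import Dict, List


def wsrep_problems(status: Dict[str, str], expected_cluster_size: int = 2) -> List[str]:
    """Return a list of reasons why the node should not receive traffic.

    `status` maps wsrep status variable names to their string values.
    An empty list means none of the checks below failed.
    """
    problems: List[str] = []

    # Values reported by a node that is a synced member of the primary component.
    expected = {
        "wsrep_ready": "ON",
        "wsrep_connected": "ON",
        "wsrep_cluster_status": "Primary",
        "wsrep_local_state_comment": "Synced",
    }
    for name, wanted in expected.items():
        actual = status.get(name, "")
        if actual != wanted:
            problems.append(f"{name} is {actual!r}, expected {wanted!r}")

    # A node that is alone in its cluster is no longer replicating to anyone.
    try:
        size = int(status.get("wsrep_cluster_size", "0"))
    except ValueError:
        size = 0
    if size < expected_cluster_size:
        problems.append(
            f"wsrep_cluster_size is {size}, expected at least {expected_cluster_size}"
        )

    return problems
```

With the values from the table above (Synced, Primary, cluster size 1) and an expected size of 2, this would report only the cluster size. A wrapper script could exit non-zero whenever the list is non-empty, so the proxy marks the backend as down.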

Here are some config parameters of both servers.

The global Ansible variables, used for both servers:

## config settings 32G
mariadb_conf_bind_adress: "0.0.0.0"
mariadb_conf_bind_port: "3306"
mariadb_conf_max_connections: "1250"
 
## Memory settings 32G
mariadb_conf_key_buffer_size: "256M"
mariadb_conf_table_open_cache: "8192"
mariadb_conf_table_definition_cache: "4096"
mariadb_conf_thread_cache_size: "32"
mariadb_conf_max_user_connections: "0"
mariadb_conf_tmp_table_size: "32M"
mariadb_conf_max_heap_table_size: "32M"
mariadb_conf_join_buffer_size: "1M"
mariadb_conf_sort_buffer_size: "2M"
mariadb_conf_read_rnd_buffer_size: "1M"
 
## InnoDB settings 32G
mariadb_conf_innodb_buffer_pool_size: "8G"
mariadb_conf_innodb_buffer_pool_instances: "16"
mariadb_conf_innodb_thread_concurrency: "8"
mariadb_conf_innodb_log_file_size: "1G"
mariadb_conf_innodb_log_buffer_size: "16M"
mariadb_conf_innodb_flush_log_at_trx_commit: "2"
mariadb_conf_innodb_lock_wait_timeout: "300"
mariadb_conf_innodb_io_capacity: "4000"
mariadb_conf_innodb_io_capacity_max: "8000"
mariadb_conf_innodb_read_io_threads: "8"
mariadb_conf_innodb_write_io_threads: "8"
 
## mysqldump settings
mariadb_conf_mysqldump_max_allowed_packet: "1G"

And the Galera configs (node 10.49.97.152 first, then node 10.49.97.153):

[mysqld]
####################################
# mysql/mariadb settings
 
binlog_format            = ROW
default-storage-engine   = innodb
innodb_autoinc_lock_mode = 2
innodb_file_per_table    = on
binlog-row-image         = minimal
 
innodb_doublewrite  = 1
 
####################################
# galera settings
 
wsrep_on              = ON
wsrep_provider        = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name    = "sql-gc01"
wsrep_cluster_address = "gcomm://10.49.97.152,10.49.97.153"
wsrep_node_address    = "10.49.97.152"
wsrep_sst_method      = rsync
ignore-db-dir         = .snapshot
 
 
[mysqld]
####################################
# mysql/mariadb settings
 
binlog_format            = ROW
default-storage-engine   = innodb
innodb_autoinc_lock_mode = 2
innodb_file_per_table    = on
binlog-row-image         = minimal
 
innodb_doublewrite  = 1
 
####################################
# galera settings
 
wsrep_on              = ON
wsrep_provider        = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name    = "sql-gc01"
wsrep_cluster_address = "gcomm://10.49.97.152,10.49.97.153"
wsrep_node_address    = "10.49.97.153"
wsrep_sst_method      = rsync
ignore-db-dir         = .snapshot

Generated at Thu Feb 08 10:30:48 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.