Details
-
Bug
-
Status: Closed (View Workflow)
-
Blocker
-
Resolution: Fixed
-
10.2.27
-
None
Description
Suppose there is a 3 node mariadb galera cluster and an async replicated node connected to it. The galera cluster has a partitioned table that you wish to keep only for a few weeks, while the slave has several months of data in it. Once in a while you have to drop the partitions in the galera cluster without affecting the async slave.
On the load-balancer (haproxy): disable traffic to galera1 node.
galera1 node)
SET GLOBAL wsrep_OSU_method='RSU';
SET SESSION sql_log_bin = 0;
ALTER TABLE db.table DROP PARTITION 201909;
ALTER TABLE db.table DROP PARTITION 201910;
ALTER TABLE db.table DROP PARTITION 201911;
And so on for all galera nodes. This way the galera nodes will have the 201909-201911 partitions dropped, while the slave will have the table as before.
In some cases the DROP PARTITION statement causes the mariadb process to hang, the operation doesn't end and a new connection cannot be made to the server. Server version 10.2.27. There is nothing in the error log.
Galera nodes have the following configuration:
ignore_db_dirs = lost+found
innodb_buffer_pool_size = 120G
innodb_flush_log_at_trx_commit = 1
max_allowed_packet = 60M
table_open_cache = 800
wsrep_on=ON
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
query_cache_size=0
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=20G; gcache.page_size=20G; gcs.fc_limit=256; gcs.fc_factor=0.99; gcs.fc_master_slave=YES"
wsrep-node-name=node1
wsrep_node_address=1.1.1.1
wsrep_cluster_name=cluster
wsrep_cluster_address="gcomm://1.1.1.1,1.1.1.2,1.1.1.3"
wsrep_sst_method=mariabackup
wsrep_sst_auth=user:pass
wsrep_slave_threads=24
binlog_checksum = CRC32
innodb_checksum_algorithm = crc32
innodb_strict_mode = 0
log_slave_updates = 1
master_verify_checksum = 1
max_connect_errors = 500
max_connections = 1500
max_heap_table_size = 1G
slave_parallel_threads = 20
slave_sql_verify_checksum = 1
sql_mode = "NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
sync_binlog = 1
tmp_table_size = 1G
This also happened in version 10.1, but it does not happen all the time. If it happens, only sending kill -9 to mariadb works to stop it.