[MDEV-29048] Table create statement on master does not get replicated on replica Created: 2022-07-06  Updated: 2022-07-25  Resolved: 2022-07-25

Status: Closed
Project: MariaDB Server
Component/s: Replication
Affects Version/s: 10.6.7, 10.7.3, 10.7.4, 10.8.3
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Erlandas Assignee: Unassigned
Resolution: Not a Bug Votes: 0
Labels: replication
Environment:

Ubuntu 20.04



 Description   

Simple CREATE TABLE (or a column in the table) statement used to create a new table in the existing database does not get replicated on replica. Sometimes it works from some MySQL DBA clients but usually it is an issue from CLI, DataGrip and other DBA.

Replica then issues:

Last_Errno: 1146
Last_Error: Error 'Table 'jobs2.queued_tasks_slots' doesn't exist' on query. Default database: 'jobs2'. Query: 'INSERT INTO jobs2.queued_tasks_slots (task_name, slots_used, updated_at) VALUES ('Insert_jobs_batch_into_indexes', DEFAULT, DEFAULT)'
Slave_IO_Running: Yes
Slave_SQL_Running: No

I have been having this issue since 10.6.7 when I started using Master-> Replica setup

In the past I was getting 1054 error which is unknown column name (when I was adding column to the existing table but I see the same problem is with new table addition too), now with 10.8.3 I am getting 1146 error but the cause is always the same - create a table with columns or a new column on master and replication then breaks on replica when master gets writes into newly created column/table.

The only solution then is to import the whole database again on replica from master dump and start replication again.

Replica config as follows:

[server]
 
# this is only for the mysqld standalone daemon
[mysqld]
 
#
# * Basic Settings
#
 
user                    = mysql
pid-file                = /run/mysqld/mysqld.pid
basedir                 = /usr
datadir                 = /var/lib/mysql
lc-messages-dir         = /usr/share/mysql
lc-messages             = en_US
skip-external-locking
 
#Replication Settings
read_only=1
server_id=14
expire_logs_days=2
gtid_strict_mode=1
#Use the two lines below to get over duplicate primary key entries errors and deletion of non existent keys or when some table did not get ALTER replicated
##slave-skip-errors=1062,1032
##slave-skip-errors=1054,1091
##skip-slave-start
slave_transaction_retry_interval=2
slave_transaction_retries=30
replicate_do_db=cms
replicate_do_db=lgeonames
replicate_do_db=jobs2
replicate_do_db=osticket
replicate_do_db=phpbt
replicate_do_db=phpcrwl
replicate_do_db=schools2
slave_domain_parallel_threads=32
slave_parallel_threads=100
slave_parallel_max_queued=524288
email_db.slave_parallel_mode=conservative
web1_db.slave_parallel_mode=conservative
web2_db.slave_parallel_mode=conservative
 
performance-schema=0
sql_mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"
open_files_limit = 240000
log-error=/var/lib/mysql/errors.log
tmpdir = /var/lib/mysql/tmp
innodb_file_per_table=1
innodb_defragment=1
default-storage-engine=InnoDB
innodb_flush_method=O_DIRECT
innodb_buffer_pool_size=180G
innodb_change_buffering=all
innodb_io_capacity = 200
innodb_read_io_threads = 64
innodb_write_io_threads = 64
innodb_log_file_size=128M
innodb_log_buffer_size =64M
max_connections=1600
key_buffer_size = 8M
max_allowed_packet=512M
#slow_query_log                          = 1
#slow_query_log_file                     = /var/lib/mysql/slow.log
#long_query_time                         = 1
#log_queries_not_using_indexes=1
wait_timeout=120
innodb_lock_wait_timeout=60
lock_wait_timeout = 50
interactive_timeout=120
read_buffer_size = 128K
table_definition_cache = 6400
table_open_cache = 12000
max_heap_table_size = 128M
tmp_table_size = 128M
join_buffer_size = 256K
sort_buffer_size = 256K
skip-name-resolve=1
query_cache_type = 0
query_cache_size = 0
innodb_buffer_pool_dump_pct=75
innodb_buffer_pool_dump_at_shutdown=ON
innodb_buffer_pool_load_at_startup=ON
innodb_flush_log_at_trx_commit =1



 Comments   
Comment by Angelique Sklavounos (Inactive) [ 2022-07-09 ]

Hi Erland22, thank you for the bug report. Unfortunately I was not able to reproduce this. I tried with 10.6.7 and the mariadb/mysql client, using the following commands:

create database jobs2;
create table jobs2.queued_tasks_slots (task_name varchar(255), slots_used int DEFAULT 61);
insert into jobs2.queued_tasks_slots (task_name) values ('task_1');
alter table jobs2.queued_tasks_slots add column (updated_at timestamp default now());

1) Do these statements (particularly the Alter) match what you tried?
2) Could you also please provide the .cnf for the primary server, and report what show create table shows for jobs2.queued_tasks_slots?

Comment by Erlandas [ 2022-07-13 ]

Hello Angelique,

This is the exact statement I used to create the table (there was no ALTER statement, just create table and it did not get replicated to Replica):

create table jobs2.queued_tasks_slots
(
    task_name  varchar(100)                                  not null,
    slots_used smallint unsigned default 0                   not null,
    updated_at datetime          default current_timestamp() not null
on update current_timestamp(),
    constraint task_name
        unique (task_name)
);

jobs2 database already had other tables and data before creating `queued_tasks_slots` table. I added table with columns (which did not get replicated) not just column to the database (Sorry about the confusion).

Master has /etc/mysql/mariadb.conf.d/server.cnf :

# this is read by the standalone daemon and embedded servers
[server]
 
# this is only for the mysqld standalone daemon
[mysqld]
 
#
# * Basic Settings
#
 
user                    = mysql
pid-file                = /run/mysqld/mysqld.pid
basedir                 = /usr
datadir                 = /var/lib/mysql
lc-messages-dir         = /usr/share/mysql
lc-messages             = en_US
skip-external-locking
 
 
#Replication Settings
log-bin
server_id=12
expire_logs_days=2
gtid_strict_mode=1
 
gtid_domain_id=12
log-basename=master12
binlog_format=mixed
 
replicate_do_db=jobs2
 
performance-schema=0
sql_mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION"
open_files_limit = 240000
log-error=/var/lib/mysql/errors.log
tmpdir = /var/lib/mysql/tmp
default-storage-engine=InnoDB
innodb_file_per_table=1
innodb_defragment=1
innodb_flush_method=O_DIRECT
innodb_buffer_pool_size=180G
innodb_change_buffering=all
innodb_log_file_size=384M
innodb_log_buffer_size =192M
innodb_io_capacity = 200
innodb_read_io_threads = 64
innodb_write_io_threads = 64
max_connections=1600
key_buffer_size = 10M
max_allowed_packet=512M
#slow_query_log= 1
#slow_query_log_file= /var/lib/mysql/slow.log
#long_query_time= 1
#log_queries_not_using_indexes=1
wait_timeout=120
interactive_timeout=120
innodb_lock_wait_timeout=60
lock_wait_timeout = 50
read_buffer_size = 128K
table_definition_cache = 6400
table_open_cache = 12000
max_heap_table_size = 128M
tmp_table_size = 128M
join_buffer_size = 256K
sort_buffer_size = 256K
 
skip-name-resolve=1
query_cache_type = 0
query_cache_size = 0
innodb_buffer_pool_dump_pct=75
innodb_buffer_pool_dump_at_shutdown=ON
innodb_buffer_pool_load_at_startup=ON
sync_binlog=0
 
 
[mysqldump]
max_allowed_packet=512M

Comment by Angelique Sklavounos (Inactive) [ 2022-07-13 ]

Great, thanks Erland22, I can see the behavior on 10.6.7. The problem appears to be with replicate_do_db (admittedly I left this out the first time, a bad assumption on my part) and if a default database is not set. If I clear the replicate_do_db filter, then the table replicates.

However, with the filter set, the table will replicate if it is preceded by USE jobs2.

So I would just like to verify with you that the CREATE TABLE statement below was NOT preceded by USE jobs2. From your comment:

This is the exact statement I used to create the table (there was no ALTER statement, just create table and it did not get replicated to Replica):

create table jobs2.queued_tasks_slots
(
task_name varchar(100) not null,
slots_used smallint unsigned default 0 not null,
updated_at datetime default current_timestamp() not null
on update current_timestamp(),
constraint task_name
unique (task_name)
);

In other words, with replicate_do_db=jobs2:
Table not replicated

CREATE DATABASE jobs2;
CREATE TABLE jobs2.t1 (a int); #not replicated
USE jobs2;
INSERT INTO jobs2.t1 VALUES (4); #1146 error on replica

Table replicated

CREATE DATABASE jobs2;
USE jobs2;
CREATE TABLE jobs2.t1 (a int); #replicated
INSERT INTO jobs2.t1 VALUES (4); #no error on replica

Also, with the following, the table is not replicated, but also the 1146 error does not appear:
Table not replicated and no error

CREATE DATABASE jobs2;
CREATE TABLE jobs2.t1 (a int); #not replicated
INSERT INTO jobs2.t1 VALUES (4); #no 1146 error

The only solution then is to import the whole database again on replica from master dump and start replication again.

I was curious about this, so ran mysqldump on my test server using --all-databases, and can see that after jobs2 is created, it is set as the default with USE.

Comment by Erlandas [ 2022-07-13 ]

I'm glad to hear that you were able to get the problem repeated on your side. The problem is on 10.6.7 and 10.7.3 and 10.8.3 - I have tried all of those and was getting the same issue.

I use replicate_do_db=jobs2 to specify which exact database to replicate (and other databases on other servers as well as I have multiple masters replicating into one replica. And I have multiple databases on each server but i need only specific databases to be replicated)

I did not state `USE jobs2;` as the statement was executed via admin panel (where database was already selected).

But when I try via CLI - yes, i issue `USE jobs2;` but as you mentioned - the problem comes from `replicate_do_db=jobs2` setting. I was having same issue most of the time in CLI as well (sometimes somehow it miraculously did replicate when executed from CLI but usually not)

And what I mean admin panel is - PHP Adminer or Data Grip or some other application which allows working with MariaDB/MySQL without going to Command Line Interface. In the admin panel you just select database which corresponds the `use jobs2;` in CLI.

Comment by Angelique Sklavounos (Inactive) [ 2022-07-14 ]

Hi Erland22, unfortunately then it looks like I am NOT reproducing what you are seeing, but I was testing with only a single master.

But when I try via CLI - yes, i issue `USE jobs2;` but as you mentioned - the problem comes from `replicate_do_db=jobs2` setting. I was having same issue most of the time in CLI as well (sometimes somehow it miraculously did replicate when executed from CLI but usually not)

As I said:

However, with the filter set, the table will replicate if it is preceded by USE jobs2.

and

Table replicated

CREATE DATABASE jobs2;
USE jobs2;
CREATE TABLE jobs2.t1 (a int); #replicated
INSERT INTO jobs2.t1 VALUES (4); #no error on replica

I use replicate_do_db=jobs2 to specify which exact database to replicate (and other databases on other servers as well as I have multiple masters replicating into one replica. And I have multiple databases on each server but i need only specific databases to be replicated)

I will test with multiple masters, but if you are seeing the issue with a single master, do you have SQL logs, error logs, output from SHOW SLAVE STATUS, output from SHOW VARIABLES in the case when it did not replicate and CLI was being used?

Comment by Angelique Sklavounos (Inactive) [ 2022-07-14 ]

I use replicate_do_db=jobs2 to specify which exact database to replicate (and other databases on other servers as well as I have multiple masters replicating into one replica. And I have multiple databases on each server but i need only specific databases to be replicated)

But when I try via CLI - yes, i issue `USE jobs2;` but as you mentioned - the problem comes from `replicate_do_db=jobs2` setting. I was having same issue most of the time in CLI as well (sometimes somehow it miraculously did replicate when executed from CLI but usually not)

This might be useful to consult (and might explain why the issue is seen most of the time but not always):
https://mariadb.com/kb/en/multi-source-replication/#replication-variables-for-multi-source

Comment by Erlandas [ 2022-07-14 ]

Yes, I see that error logs is there for Replica server and error logs are present for the master. I have not tried with single master as it was setup with multiple masters to replicate to single replica.

Where would SQL log be located? I see Binlogs but I cannot see SQL log file.
As for SHOW SLAVE STATUS, output from SHOW VARIABLES - I can get by purposefully breaking replication with new table creation.

Where can I upload error log and other logs privately? They would be around 10-15 Mb each when Zipped.

Yes, I see 'replicate_do_db' limitation on the URL you have shown. So what setup would be best in my scenario when I need to replicate from multiple masters, only specific databases to single replica?

And how does it replicate other statements when we do not specify USE XXXX; - replicated just row by row?

Comment by Erlandas [ 2022-07-14 ]

Yes, I see that error logs is there for Replica server and error logs are present for the master. I have not tried with single master as it was setup with multiple masters to replicate to single replica.

Where would SQL log be located? I see Binlogs but I cannot see SQL log file.
As for SHOW SLAVE STATUS, output from SHOW VARIABLES - I can get by purposefully breaking replication with new table creation.

Where can I upload error log and other logs privately? They would be around 10-15 Mb each when Zipped.

Yes, I see 'replicate_do_db' limitation on the URL you have shown. So what setup would be best in my scenario when I need to replicate from multiple masters, only specific databases to single replica?

And how does it replicate other statements when we do not specify USE XXXX; - replicated just row by row?

Comment by Erlandas [ 2022-07-14 ]

I now fully understood what you meant in previous messages about using `USE` before CREATE TABLE. MariaDB replicate_do_db setting is simply designed to work that way with statement based replication which to me looks error prone requirement which basically is not really necessary. When we look at the currently required usage to avoid error:

Table replicated work this way:

CREATE DATABASE jobs2;
USE jobs2;
CREATE TABLE jobs2.t1 (a int); #replicated
INSERT INTO jobs2.t1 VALUES (4); #no error on replica

But in fact - we already tell which database to use with CREATE statement.

`CREATE TABLE jobs2.t1` - already says - we are working with jobs2 database. No other database is involved. So why is `USE jobs2;` is so necessary?

I understand it is necessary if I issue such statement:

`CREATE TABLE t1;` - then surely - MariaDB does not know on which database to create the table but if I am executing `CREATE TABLE jobs2.t1` - issuing `USE jobs2;` before it - is completely unnecessary.

Comment by Elena Stepanova [ 2022-07-25 ]

As of now, it is not the question of necessity, historically it was designed and implemented this way. True enough, it is confusing to many users, which is the reason why it is very verbosely documented both in MySQL manual and in MariaDB KB.

You are very welcome to create a feature request with suggestions how the behavior could be changed. Meanwhile, you can explore replicate-wild-do-table, it may be more to your liking.

Generated at Thu Feb 08 10:05:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.