[MDEV-19530] Failing DROP TABLE replicated in Galera Created: 2019-05-20  Updated: 2019-12-12  Resolved: 2019-12-12

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.3.10
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Claudio Nanni Assignee: Jan Lindström (Inactive)
Resolution: Not a Bug Votes: 2
Labels: None

Attachments: File 190915_mysqld.1.err     Zip Archive MDEV-19530_190915_logs.zip     File galera_mdev_19530.result     File galera_mdev_19530.test    

 Description   

A DROP TABLE that fails on a node because the table is missing is still replicated to the other nodes.

In this example I first remove the table locally on node 10310 (with replication off), and then drop it again with replication on.
The second DROP TABLE fails on 10310 since the table is missing, but it is still replicated to the other nodes.

A similar bug was fixed long ago, but that one involved a DROP TABLE failing due to missing permissions:

https://jira.mariadb.org/browse/MDEV-8598

I am not sure how important this specific bug is, since the scenario already involves an inconsistent state of the nodes; on the other hand, replicating a command that has failed is odd in itself, and it could be the tip of an iceberg.

# mh cli 10310
 
MariaDB [(none)]> create database db99;
Query OK, 1 row affected (0.019 sec)
 
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> create table t1 (id int);
Query OK, 0 rows affected (0.037 sec)
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.001 sec)
 
# mh cli 20310
 
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.001 sec)
 
MariaDB [db99]> Bye
 
# mh cli 30310
 
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.000 sec)
 
MariaDB [db99]> Bye
 
# mh cli 10310
 
MariaDB [(none)]> use db99;
 
MariaDB [db99]> set wsrep_on=off;
 
MariaDB [db99]> drop table t1;
Query OK, 0 rows affected (0.015 sec)
 
MariaDB [db99]> show tables;
Empty set (0.001 sec)
 
MariaDB [db99]> Bye
 
# mh cli 20310
 
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.001 sec)
 
MariaDB [db99]> Bye
 
# mh cli 30310
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.001 sec)
 
MariaDB [db99]> Bye
 
# mh cli 10310
 
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> show tables;
Empty set (0.000 sec)
 
MariaDB [db99]> drop table t1;
ERROR 1051 (42S02): Unknown table 'db99.t1'
 
MariaDB [db99]> Bye
 
# mh cli 20310
 
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> show tables;
Empty set (0.000 sec)
 
MariaDB [db99]> Bye
# mh cli 30310
 
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> show tables;
Empty set (0.001 sec)
 
MariaDB [db99]> 



 Comments   
Comment by Claudio Nanni [ 2019-05-27 ]

I want to add, after some discussions, that Galera by design fires DDLs on all nodes and has no way to verify success on each node.
The case reported here follows that principle in a way, but in another it is counter-intuitive: the DDL fails directly on the originating node, so it is likely to create a difference when executed on the other nodes, yet it is still sent to them.
Put another way, a DDL that fails on the very node where it is executed is the easiest case for knowing that it will fail on at least one node, so we could do something to stop it.
Maybe it is a matter of the semantics of DDLs in Galera.

Comment by Jan Lindström (Inactive) [ 2019-09-05 ]

As you did not include error logs from all nodes, it is not clear to me whether the first node is still part of the cluster or not. When that node rejoins the cluster, the nodes should detect that the database state is inconsistent, and any node detecting the inconsistency should shut down.
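A quick way to check whether the node is still a synced cluster member, without the error logs, is the standard wsrep status variables (these are the stock Galera status variables, shown here only as a sketch of what to look for):

```sql
-- On the node in question: a synced member shows wsrep_cluster_status = 'Primary'
-- and wsrep_local_state_comment = 'Synced'; wsrep_cluster_size should match
-- the expected node count.
SHOW STATUS LIKE 'wsrep_cluster_status';
SHOW STATUS LIKE 'wsrep_local_state_comment';
SHOW STATUS LIKE 'wsrep_cluster_size';
```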

Comment by Claudio Nanni [ 2019-09-09 ]

Hi.
The cluster status does not change, and the cluster keeps running without any problem until that table is used.
The cluster nodes have no idea that a DDL failed (and this is by design): the DDL is fired on all nodes, but there is no feedback mechanism (it is not transactional), so it can fail on any node and the others will not know. The strange thing in this case is that when the DDL fails on the very node where it is run, it seems clear that it should not be sent to the other nodes. This is probably only a "formal" consideration, because in practice, on nodes that are in sync, the DDL should fail on all nodes anyway.

Comment by Stepan Patryshev (Inactive) [ 2019-09-15 ]

I've managed to reproduce it, but, naturally, only if "set wsrep_on=on;" is run before the second "drop table t1;".

All binaries are non-debug built from sources:
MariaDB Server 10.3.19-MariaDB-debug: branch 10.3, commit bfbf0f225179d79c1c568b93ddc8b5dd6a670072.
Galera Lib3: branch mariadb-3.x, commit d4f27ab85710767c8bb86a973bc244c0a24b0f0d.

See 190915_mysqld.1.err and all other attached logs: MDEV-19530_190915_logs.zip.

N1:
 
MariaDB [(none)]> create database db99;
Query OK, 1 row affected (0.032 sec)
 
MariaDB [(none)]> use db99;
Database changed
 
MariaDB [db99]> create table t1 (id int);
Query OK, 0 rows affected (0.027 sec)
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.002 sec)
 
====================
 
N2:
 
MariaDB [(none)]> use db99;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Database changed
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.002 sec)
 
====================
 
N3:
 
MariaDB [(none)]> use db99;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Database changed
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.002 sec)
 
====================
 
N1:
 
MariaDB [db99]> set wsrep_on=off;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [db99]> show variables like "wsrep_on";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| wsrep_on      | OFF   |
+---------------+-------+
 
MariaDB [db99]> drop table t1;
Query OK, 0 rows affected (0.018 sec)
 
MariaDB [db99]> show tables;
Empty set (0.001 sec)
 
====================
 
N2:
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.002 sec)
 
====================
 
N3:
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.001 sec)
 
====================
 
N1:
 
MariaDB [db99]> drop table t1;
ERROR 1051 (42S02): Unknown table 'db99.t1'
 
====================
 
N2:
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.002 sec)
 
====================
 
N3:
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.001 sec)
 
====================
 
N1:
 
MariaDB [db99]> set wsrep_on=on;
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [db99]> show variables like "wsrep_on";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| wsrep_on      | ON    |
+---------------+-------+
1 row in set (0.003 sec)
 
====================
 
N2:
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.002 sec)
 
====================
 
N3:
 
MariaDB [db99]> show tables;
+----------------+
| Tables_in_db99 |
+----------------+
| t1             |
+----------------+
1 row in set (0.001 sec)
 
====================
 
N1:
 
MariaDB [db99]> drop table t1;
ERROR 1051 (42S02): Unknown table 'db99.t1'
 
====================
 
N2:
 
MariaDB [db99]> show tables;
Empty set (0.002 sec)
 
====================
 
N3:
 
MariaDB [db99]> show tables;
Empty set (0.002 sec)

Comment by Stepan Patryshev (Inactive) [ 2019-09-16 ]

I've created an MTR for this defect: galera_mdev_19530.test and galera_mdev_19530.result.

Comment by Jan Lindström (Inactive) [ 2019-09-18 ]

I think the expectations here do not match the actual design. In TOI, a DDL statement (here DROP TABLE) is replicated to all nodes in the cluster before it is executed on any of them. By setting wsrep_on=OFF you indicate that you know what you are doing, i.e. you are executing operations that affect only one node, and if you are not careful they can cause the nodes to go out of sync. If you really want to do this, I suggest using RSU.
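For reference, RSU (Rolling Schema Upgrade) applies the DDL only on the local node, which is desynced from the cluster for the duration, so a locally failing statement is never replicated. A minimal sketch of the suggestion, using the standard wsrep_OSU_method variable (to be run on each node in turn):

```sql
-- Switch this session to Rolling Schema Upgrade:
-- the DDL is applied locally only and is not replicated.
SET SESSION wsrep_OSU_method = 'RSU';
DROP TABLE t1;   -- a failure here stays on this node
-- Return to the default Total Order Isolation.
SET SESSION wsrep_OSU_method = 'TOI';
```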

Comment by Claudio Nanni [ 2019-10-28 ]

jplindst Indeed, TOI DDL statements are replicated to all nodes before execution (regardless of the outcome on the local node), and this is known behaviour; but it also reminds me that Galera is limited in that the cluster cannot confirm successful execution on all nodes. Who knows whether this can be improved somehow in the future (some basic DDL sanity checks?).

NOTE: set wsrep_on=OFF was used in the example only to create the inconsistent state (the table missing on one node), but that is not required: any DDL that fails on the originating node is replicated to the others regardless. While we understand this happens due to Galera's design, any regular user is probably surprised by it.

Generated at Thu Feb 08 08:52:24 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.