[MDEV-11817] Altering a table with more rows than wsrep_max_ws_rows causes cluster to break when running Galera cluster in TOI mode
Created: 2017-01-16  Updated: 2017-02-01  Resolved: 2017-02-01

Status: Closed
Project: MariaDB Server
Component/s: Galera
Affects Version/s: 10.1.20
Fix Version/s: 10.1.22

Type: Bug
Priority: Major
Reporter: Joseph Palermo
Assignee: Nirbhay Choubey (Inactive)
Resolution: Fixed
Votes: 0
Labels: galera
Environment: Ubuntu Trusty. Galera 25.3.17
Sprint: 10.2.4-1, 10.2.4-2

 Description   

Steps to reproduce:
1. Create a 3-node Galera cluster.
2. Set wsrep_max_ws_rows to a non-zero value.
3. Create a table containing more rows than wsrep_max_ws_rows.
4. Perform an alter table statement on that table. It will get rejected with:
ERROR 1180 (HY000): wsrep_max_ws_rows exceeded done
5. The node you were connected to will still have the old table schema, but the other nodes have successfully applied the alter table changes.

At that point any data written to that table will cause the other nodes to terminate due to inconsistency.
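
The divergence described in the steps above can be illustrated with a small toy model. This is only a sketch, not MariaDB or Galera code: the names Node, toi_alter and MaxWsRowsExceeded are hypothetical, and it assumes that under TOI the other nodes apply the ALTER without enforcing wsrep_max_ws_rows while the node that received the statement does enforce it during the row copy.

```python
from dataclasses import dataclass


class MaxWsRowsExceeded(Exception):
    """Stands in for 'ERROR 1180 (HY000): wsrep_max_ws_rows exceeded'."""


@dataclass
class Node:
    # Hypothetical model of one cluster node holding a single table.
    name: str
    schema: str = "f1 int"
    row_count: int = 0

    def alter(self, new_schema: str, max_ws_rows: int, enforce_limit: bool) -> None:
        # A table-rebuilding ALTER copies every existing row into a new table.
        copied = 0
        for _ in range(self.row_count):
            copied += 1
            if enforce_limit and max_ws_rows and copied > max_ws_rows:
                # The copy is abandoned, so the old schema stays in place.
                raise MaxWsRowsExceeded("wsrep_max_ws_rows exceeded")
        self.schema = new_schema


def toi_alter(origin: Node, others: list, new_schema: str, max_ws_rows: int) -> None:
    # Assumption: the statement reaches the other nodes regardless of what
    # later happens on the originating node, and they apply it unchecked.
    for peer in others:
        peer.alter(new_schema, max_ws_rows, enforce_limit=False)
    # The originating node runs the same statement with the limit enforced.
    origin.alter(new_schema, max_ws_rows, enforce_limit=True)


if __name__ == "__main__":
    cluster = [Node(f"node{i}", row_count=5) for i in (1, 2, 3)]
    try:
        toi_alter(cluster[0], cluster[1:], "f1 char(5)", max_ws_rows=3)
    except MaxWsRowsExceeded as exc:
        print("ERROR 1180 (HY000):", exc)
    for node in cluster:
        print(node.name, "->", node.schema)
```

In this model node1 keeps the old column definition while node2 and node3 end up with the new one, which is the schema mismatch reported in step 5.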

Codership was able to verify the issue, but says it is a MariaDB-specific bug:
https://groups.google.com/forum/#!topic/codership-team/TDscBC_oAk4



 Comments   
Comment by Philip Stoev (Inactive) [ 2017-01-16 ]

In MariaDB, copy_data_between_tables uses handler::ha_write_row, which in turn has a check for wsrep_max_ws_rows. This check does not exist in Galera Cluster.

Breakpoint 1, handler::ha_write_row (this=0x7f0ac2cc7020, buf=0x7f0ac2c53420 "\375\061    ") at /home/buildbot/buildbot/build/sql/handler.cc:5926
5926      DBUG_RETURN(check_wsrep_max_ws_rows());
(gdb) bt
#0  handler::ha_write_row (this=0x7f0ac2cc7020, buf=0x7f0ac2c53420 "\375\061    ") at /home/buildbot/buildbot/build/sql/handler.cc:5926
#1  0x000000000063c5ce in copy_data_between_tables (thd=0x7f0acfe1f008, from=0x7f0ac2d34808, to=0x7f0ac2c62008, create=..., ignore=false, order_num=0, order=0x0, copied=0x7f0ad7ffa048,
    deleted=0x7f0ad7ffa040, keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x7f0ad7ff88e0) at /home/buildbot/buildbot/build/sql/sql_table.cc:9547
#2  0x0000000000644912 in mysql_alter_table (thd=0x7f0acfe1f008, new_db=0x7f0ac2ca2718 "test", new_name=<optimized out>, create_info=0x7f0ad7ffa6c0, table_list=0x7f0ac2ca2128,
    alter_info=0x7f0ad7ffa7f0, order_num=0, order=0x0, ignore=false) at /home/buildbot/buildbot/build/sql/sql_table.cc:9047
#3  0x000000000069358e in Sql_cmd_alter_table::execute (this=<optimized out>, thd=0x7f0acfe1f008) at /home/buildbot/buildbot/build/sql/sql_alter.cc:325
#4  0x00000000005b3ae0 in mysql_execute_command (thd=0x7f0acfe1f008) at /home/buildbot/buildbot/build/sql/sql_parse.cc:5671
#5  0x00000000005bc074 in mysql_parse (thd=0x7f0acfe1f008, rawbuf=<optimized out>, length=<optimized out>, parser_state=0x7f0ad7ffc430) at /home/buildbot/buildbot/build/sql/sql_parse.cc:7319
#6  0x00000000005bc108 in wsrep_mysql_parse (thd=0x7f0acfe1f008, rawbuf=0x7f0ac2ca2020 "alter table t1 change column f1 f1 char(5)", length=42, parser_state=0x7f0ad7ffc430)
    at /home/buildbot/buildbot/build/sql/sql_parse.cc:7142
#7  0x00000000005be1e5 in dispatch_command (command=COM_QUERY, thd=0x7f0acfe1f008, packet=0x7f0acfe30009 "alter table t1 change column f1 f1 char(5)", packet_length=42)
    at /home/buildbot/buildbot/build/sql/sql_parse.cc:1485
#8  0x00000000005bf4b7 in do_command (thd=0x7f0acfe1f008) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1108
#9  0x000000000068ee33 in do_handle_one_connection (thd_arg=<optimized out>) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1350
#10 0x000000000068f052 in handle_one_connection (arg=0x7f0acfe1f008) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1262
#11 0x0000000000b10ba9 in pfs_spawn_thread (arg=<optimized out>) at /home/buildbot/buildbot/build/storage/perfschema/pfs.cc:1860
#12 0x0000003be66076ca in start_thread () from /lib64/libpthread.so.0
#13 0x0000003be5f0779d in clone () from /lib64/libc.so.6
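
To make the call path in the backtrace easier to follow, here is a simplified Python model of it. It is a sketch under stated assumptions, not the server code: Session, the in_toi flag and the skip_in_toi parameter are hypothetical, and the guard shown is just one conceivable way to avoid the failure, not necessarily the change that was committed for this issue.

```python
ERR_MAX_WS_ROWS = 1180  # error number quoted in the report


class Session:
    """Hypothetical stand-in for the per-connection state (thd in the backtrace)."""

    def __init__(self, max_ws_rows: int, in_toi: bool = False):
        self.max_ws_rows = max_ws_rows  # 0 is treated as "no limit"
        self.in_toi = in_toi            # True while a TOI DDL statement runs
        self.affected_rows = 0


def check_max_ws_rows(session: Session, skip_in_toi: bool) -> int:
    """Count one written row; return an error code once the limit is exceeded."""
    if session.max_ws_rows == 0:
        return 0
    if skip_in_toi and session.in_toi:
        # Hypothetical guard: rows copied by a TOI ALTER are not counted.
        return 0
    session.affected_rows += 1
    if session.affected_rows > session.max_ws_rows:
        return ERR_MAX_WS_ROWS
    return 0


def ha_write_row(session: Session, target: list, row, skip_in_toi: bool) -> int:
    # Write the row, then run the row-limit check, as in frame #0.
    target.append(row)
    return check_max_ws_rows(session, skip_in_toi)


def copy_data_between_tables(session: Session, source: list, skip_in_toi: bool = False):
    # The ALTER copy loop (frame #1) writes every source row through ha_write_row.
    target = []
    for row in source:
        err = ha_write_row(session, target, row, skip_in_toi)
        if err:
            return err, None  # the half-built table is discarded
    return 0, target


if __name__ == "__main__":
    rows = list(range(10))
    print(copy_data_between_tables(Session(4, in_toi=True), rows))
    print(copy_data_between_tables(Session(4, in_toi=True), rows, skip_in_toi=True))
```

Without the guard the copy stops at the fifth row and the ALTER fails locally. With the guard the same copy completes, which is the behaviour the other nodes already show in the report.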

Comment by Nirbhay Choubey (Inactive) [ 2017-01-24 ]

philip-galera hmm? IIUC, wsrep-5.6 should have this problem too.
https://github.com/codership/mysql-wsrep/blob/5.6/sql/sql_table.cc#L9170

Comment by Nirbhay Choubey (Inactive) [ 2017-01-24 ]

http://lists.askmonty.org/pipermail/commits/2017-January/010498.html

Comment by Philip Stoev (Inactive) [ 2017-01-24 ]

I do not think mysql-wsrep has this line:

DBUG_RETURN(check_wsrep_max_ws_rows());

Comment by Nirbhay Choubey (Inactive) [ 2017-01-25 ]

philip-galera check_wsrep_max_ws_rows() is a wrapper in MariaDB for code that is otherwise repeated in ha_insert|update|delete_row(). So, conceptually, it's the same.
