[MDEV-17705] Review InnoDB changes on Galera 4 wsrep patch Created: 2018-11-14  Updated: 2019-03-20  Resolved: 2019-01-17

Status: Closed
Project: MariaDB Server
Component/s: Galera, Storage Engine - InnoDB
Fix Version/s: 10.4.2

Type: Task Priority: Critical
Reporter: Jan Lindström (Inactive) Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: galera_4

Issue Links:
Blocks
blocks MDEV-16405 Merge Galera 4 changes from mariaDB_w... Closed

 Description   

Related to:

https://github.com/codership/mariaDB-wsrep/issues/49



 Comments   
Comment by Marko Mäkelä [ 2018-11-14 ]

Currently, the latest code seems to be bb-10.4-galera4.
The following command is generating 1,359 lines of output for me:

git diff 8dffa7eaf67cec01b91f521cf49447a34936ab92{^2,} storage/innobase extra/mariabackup

I have requested that the functions get_wsrep_xid_uuid() and get_wsrep_xid_seqno() be renamed back, to lose the get_ prefix. That would reduce the unnecessary diff.

I also see that some occurrences of the function wsrep_must_process_fk() have been modified. Why is that? Can that function not be modified instead? And why would we pay an extra penalty for the "Galera is disabled" case by putting the expensive check upfront? Here are a couple examples that the above command is producing for me:

@@ -2517,8 +2524,11 @@ row_upd_sec_index_entry(
 			}
 #ifdef WITH_WSREP
 			if (!referenced && foreign
-			    && wsrep_must_process_fk(node, trx)
-			    && !wsrep_thd_is_BF(trx->mysql_thd, FALSE)) {
+                            && wsrep_on(trx->mysql_thd)
+			    && !wsrep_thd_is_BF(trx->mysql_thd, FALSE)
+			    && (!parent || que_node_get_type(parent) != QUE_NODE_UPDATE ||
+				!parent->cascade_node)
+			) {
 
 				ulint*	offsets = rec_get_offsets(
 					rec, index, NULL, true,

Why was this occurrence of wsrep_must_process_fk() replaced, but others were preserved?

@@ -3064,7 +3075,8 @@ row_upd_del_mark_clust_rec(
 		err = row_upd_check_references_constraints(
 			node, pcur, index->table, index, offsets, thr, mtr);
 #ifdef WITH_WSREP
-	} else if (foreign && wsrep_must_process_fk(node, trx)) {
+	} else if (trx && wsrep_on(trx->mysql_thd) && err == DB_SUCCESS
+                   && wsrep_must_process_fk(node, trx)) {
 		err = wsrep_row_upd_check_foreign_constraints(
 			node, pcur, index->table, index, offsets, thr, mtr);
 

Why do we evaluate wsrep_on() before evaluating simpler conditions?
Why check for trx == NULL? The variable was dereferenced already at the start of the function for the expression trx->id, so it cannot possibly be NULL.
Why check for err == DB_SUCCESS? That condition always holds on this code path.
Why was the check for foreign removed? If that parameter is no longer needed, then various function parameters should be removed as well:

#ifdef WITH_WSREP
	bool		foreign,/*!< in: whether this is a foreign key */
#endif

Also, currently the called function is defined in an inefficient way:

inline bool wsrep_must_process_fk(const upd_node_t* node, const trx_t* trx)
{
	if (!wsrep_on_trx(trx)) {
		return false;
	}
	return que_node_get_type(node->common.parent) != QUE_NODE_UPDATE
		|| static_cast<upd_node_t*>(node->common.parent)->cascade_node
		!= node;
}

I think that the call to wsrep_on_trx(trx) should be last, that is, something like this:

inline bool wsrep_must_process_fk(const upd_node_t* node, const trx_t* trx)
{
	return (que_node_get_type(node->common.parent) != QUE_NODE_UPDATE
		|| static_cast<upd_node_t*>(node->common.parent)->cascade_node
		!= node)
		&& wsrep_on_trx(trx);
}

(This change should be applied to 10.1 already.) These two forms should be equivalent, but the call wsrep_on((trx)->mysql_thd)) a.k.a. wsrep_service->wsrep_on_func(trx->mysql_thd) is moved to the end.

jplindst, please notify me when the above issues have been addressed. I would like to review the remaining InnoDB changes.

Comment by Jan Lindström (Inactive) [ 2018-12-19 ]

Above changes and additional unnecessary changes are addressed.

Comment by Jan Lindström (Inactive) [ 2018-12-19 ]

in branch bb-10.4-galera4

Comment by Marko Mäkelä [ 2018-12-28 ]

I can still see changes that do not make sense to me. Here is an example:

git diff 555921a9c3d..49e41bfb50769696ff594cd439ebe3599abac984 storage/innobase/trx/trx0trx.cc 
diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc
index d4f2d0b1b00..bf6edcfdca2 100644
--- a/storage/innobase/trx/trx0trx.cc
+++ b/storage/innobase/trx/trx0trx.cc
@@ -1448,6 +1448,7 @@ trx_commit_in_memory(
 	if (trx->mysql_thd && wsrep_on(trx->mysql_thd)) {
 		trx->lock.was_chosen_as_deadlock_victim = FALSE;
 	}
+	trx->lock.was_chosen_as_wsrep_victim = FALSE;
 #endif
 
 	DBUG_LOG("trx", "Commit in memory: " << trx);

Why was the preceding conditional assignment not removed?
[EDIT: Sorry, bad example! The two long variable names are not identical!]

Also, it seems to me that the latest merge from 10.4 was more than 1 month ago (November 20). Please merge more often, and read and minimize the diff yourself, before requesting a new review.

Why are the following files added or changed?

mysql-test/include/check-testcase.test
mysql-test/include/galera_cluster.inc
mysql-test/include/galera_wait_sync_point.inc
mysql-test/include/have_ipv6.inc
mysql-test/include/kill_galera.inc
mysql-test/include/mtr_warnings.sql
mysql-test/include/restart_mysqld.inc
mysql-test/include/wait_until_connected_again.inc
mysql-test/include/wsrep_wait_disconnect.inc
mysql-test/mysql-test-run.pl
mysql-test/r/mysqld--help-notwin.result
mysql-test/r/query_cache_size_functionality.result
mysql-test/r/query_cache_type_functionality.result
mysql-test/suite/innodb/r/innodb-index-online-fk.result
mysql-test/suite/innodb/t/galera.skip
mysql-test/suite/innodb/t/innodb-index-online-fk.test

The added file mysql-test/include/have_ipv6.inc appears to (almost) duplicate the existing file mysql-test/suite/galera_3nodes/include/have_ipv6.inc.

I do not think that any subsystem-specific files should be added to mysql-test/include. They belong in the test suites of the subsystems.

Comment by Jan Lindström (Inactive) [ 2018-12-28 ]

We have more recent 10.4 merge on regression testing as MDL-locks were refactored so that current Galera implementation does not work anymore. Some of those includes are used on several suites thus they are placed main one. I will go through them and see if some of them can be moved. Those files are changed as changes are required, I will double check them but some of those changes need to be there. Anyway, in this MDEV I will try to concentrate InnoDB only changes. There is also MDEV for server changes.

Comment by Marko Mäkelä [ 2018-12-28 ]

For example, suite/innodb/include/wait_all_purged.inc can be included from InnoDB-specific tests in other test suites just fine.
If files really have to be moved, then they should be moved, not duplicated. I have no problem if the file have_ipv6.inc is moved to suite/galera/include/have_ipv6.inc.

Comment by Jan Lindström (Inactive) [ 2019-01-14 ]

commit f5769e497dd1d1b7355b92a0c51970a94726025c
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Mon Jan 14 11:27:54 2019 +0200

MDEV-17705: Review InnoDB changes on Galera 4 wsrep patch

Remove unnecessary changes and fix coding style.
modified: mysql-test/include/mtr_warnings.sql
modified: mysql-test/suite/innodb/r/innodb-index-online-fk.result
modified: mysql-test/suite/innodb/t/innodb-index-online-fk.test
modified: storage/innobase/handler/ha_innodb.cc
modified: storage/innobase/include/trx0trx.h
modified: storage/innobase/row/row0sel.cc
modified: storage/innobase/trx/trx0roll.cc
modified: storage/innobase/trx/trx0rseg.cc
modified: storage/innobase/trx/trx0trx.cc

Comment by Marko Mäkelä [ 2019-01-14 ]

Thank you for the merge.
I quickly went through the output of the following:

git diff e7924a8598c..origin/bb-10.4-galera4

Here are some highlights:

  • trx_rseg_update_wsrep_checkpoint() could check xid_seqno!=1 before invoking memcmp().
  • trx0trx.cc and trx0roll.cc are adding some code outside of #ifdef WITH_WSREP, which seems unacceptable to me.
  • row0sel.cc adds a wrong #include without enclosing it in #ifdef WITH_WSREP. It looks like it should be #include "mysql/service_wsrep.h".
  • There are some added lines with space before TAB. (These are highlighted by git show.)
  • When it comes to files in `mysql-test`, there is at least one white-space-only change to mysql-test/include/mtr_warnings.sql, which I think should be avoided.
    Also, mysql-test/suite/innodb/r/innodb-index-online-fk.test contains changes that I think should be reverted. I hope that the wsrep table has been removed already.
Comment by Jan Lindström (Inactive) [ 2019-01-14 ]

marko Above commit should address these comments.

Comment by Marko Mäkelä [ 2019-01-15 ]

I think that a merge from 10.4 would be helpful. Looking at the diff again, I see that quite a few tests are being disabled, often without any comment. A whole new file mysql-test/suite/galera/t/disabled.def is being added, duplicating the file mysql-test/suite/galera/disabled.def. The file mysql-test/suite/galera/r/galera_ist_progress.result contains conflict markers. Many .result files in mysql-test/suite/galera/r add two lines for two connections, including ysql-test/suite/galera/r/galera_sst_rsync2,debug.rdiff. Could this noise be avoided?

The file mysql-test/suite/galera/r/galera_sst_rsync,debug.rdiff is being changed without any good reason.

The test mysql-test/suite/galera/r/galera_parallel_autoinc_largetrx.result is showing a different result for SELECT COUNT(*), that is, 6 instead of 3 rows. Could we please replace COUNT(*) with * when the resultsets are so small? (Use replace_column to mask any nondeterministic columns in the output.) Also, I still see things like SELECT COUNT(*) = 4, which are returning no useful information when the test is failing. In galera.galera_var_slave_threads I see that SELECT COUNT(*) has been degraded to SELECT COUNT(*) = some_constant.

Some files add lines with trailing white space. Some files add lines where TAB is followed by space when it should not be.

Some tests refer to the deprecated variable debug instead of debug_dbug.

In mysql-test/suite/galera/t/MW-86-wait1.test I suspect that there is a race condition:

SET SESSION debug_sync = "now SIGNAL signal.wsrep_apply_cb";
SET debug_sync='RESET';

If the signal was not consumed before the second statement, then it could be lost, causing the test to fail.

Many tests add source include/have_innodb.inc, even though include/galera_cluster.inc already does that. For example, I do not think that we need any change to mysql-test/suite/galera/t/galera_sst_mysqldump.test.

In mysql-test/suite/galera/t/galera_toi_ddl_locking.test there does not seem to be faith in DEBUG_SYNC, because there is an extra 10-second sleep. Why?

--connection node_1
SET DEBUG_SYNC= 'RESET';
SET DEBUG_SYNC = 'alter_table_before_open_tables SIGNAL before_open_tables WAIT_FOR continue';
--send ALTER TABLE t1 ADD COLUMN f2 INTEGER;
--sleep 10
--connection node_1a
SET DEBUG_SYNC= 'now WAIT_FOR before_open_tables';

I do not think that any new Galera-specific files belong in the top-level mysql-test/include directory. The changes to mysql-test/suite/galera/t/mysql-wsrep#332.test and mysql-test/suite/galera_3nodes/t/galera_ist_gcache_rollover.test should be reverted.

Why does mysql-test/suite/galera/t/rpl_row_annotate.test comment out source include/galera_end.inc at the end?

Why are many lines commented out from mysql-test/suite/galera_3nodes/t/galera_ipv6_xtrabackup-v2.test?

I do not quite understand this:

@@ -5515,7 +5515,7 @@ static Sys_var_ulong Sys_wsrep_mysql_replication_bundle(
 
 static Sys_var_mybool Sys_wsrep_load_data_splitting(
        "wsrep_load_data_splitting", "To commit LOAD DATA "
-       "transaction after every 10K rows inserted",
+       "transaction after every 10K rows inserted (deprecating)",
        GLOBAL_VAR(wsrep_load_data_splitting), 
        CMD_LINE(OPT_ARG), DEFAULT(TRUE));

If this setting is supposed to be deprecated (which is what I guess is meant), then why it is still enabled by default? Shouldn't the deprecation warning be issued when the variable is being set?

The file scripts/mysqld_safe.sh is missing quotes around "$wr_logfile_permanent".

Why is the file scripts/wsrep_sst_mysqldump.sh being modified? Or scripts/wsrep_sst_rsync.sh? The latter look like bug fixes that should be done in 10.1 already!

If wsrep_fake_trx_id() is being removed, why are ha_fake_trx_id() and handlerton::fake_trx_id not removed?

It looks like the changes to innobase_kill_query() should be applied to 10.2 already. This could fix hangs in the Galera transaction abort in 10.2 and 10.3.

In wsrep_append_foreign_key(), the format string is wrongly using "%lu" instead of ULINTPF. The index->name will always hold. This whole hunk should be reverted:

@@ -10237,11 +10294,11 @@ wsrep_append_foreign_key(
 
        if (rcode != DB_SUCCESS) {
                WSREP_ERROR(
-                       "FK key set failed: " ULINTPF
-                       " (" ULINTPF " " ULINTPF "), index: %s %s, %s",
-                       rcode, referenced, shared,
-                       (index)       ? index->name() : "void index",
-                       (index && index->table) ? index->table->name.m_name :
+                       "FK key set failed: %lu (%lu %s), index: %s %s, %s",
+                       rcode, referenced, wsrep_key_type_to_str(key_type),
+                       (index && index->name)       ? index->name :
+                               "void index",
+                       (index) ? index->table->name.m_name :
                                "void table",
                        wsrep_thd_query(thd));
                return DB_ERROR;

Do we really have to duplicate some #include in storage/innobase/handler/ha_innodb.h? One #include <mysql/service_wsrep.h> should suffice.

The function wsrep_kill_victim() should remove the local variable bf_this. After the first check, it will always hold.

This change should be reverted:

@@ -1909,6 +1915,23 @@ row_ins_check_foreign_constraint(
                check_table->inc_fk_checks();
 
                lock_wait_suspend_thread(thr);
+#ifdef WITH_WSREP
+               ut_ad(!trx_mutex_own(trx));
+               switch (trx->error_state) {
+               case DB_DEADLOCK:
+                       if (wsrep_debug) {
+                               ib::info() <<
+                               "WSREP: innodb trx state changed during wait "
+                               << " trx: " << trx->id << " with error_state: "
+                               << trx->error_state << " err: " << err;
+                       }
+                       err = trx->error_state;
+                       break;
+               default:
+                       break;
+               }
+
+#endif /* WITH_WSREP */
 
                thr->lock_state = QUE_THR_LOCK_NOLOCK;
 

It looks like a MDEV-16690 fix that was made redundant by MDEV-17541.

Here, do we need parentheses around the last ?: expression? If not, I would suggest adding parentheses around the expression that precedes ?, for clarity.

@@ -1811,7 +1814,10 @@ row_ins_check_foreign_constraint(
                                                rec,
                                                check_index,
                                                check_ref,
-                                               (upd_node) ? TRUE : FALSE);
+                                               upd_node != NULL &&
+                                               wsrep_protocol_version < 4
+                                               ? WSREP_SERVICE_KEY_SHARED
+                                               : WSREP_SERVICE_KEY_REFERENCE);
 #endif /* WITH_WSREP */
                                        goto end_scan;
                                } else if (foreign->type != 0) {

Here, we should add else before the added if in order to avoid unnecessary performance penalty:

@@ -4463,6 +4466,13 @@ row_search_mvcc(
 
                set_also_gap_locks = FALSE;
        }
+#ifdef WITH_WSREP
+       if (wsrep_thd_skip_locking(trx->mysql_thd)) {
+
+               ut_ad(sr_table_name_full_str == prebuilt->table->name.m_name);
+               set_also_gap_locks = FALSE;
+       }
+#endif /* WITH_WSREP */
 
        /* Note that if the search mode was GE or G, then the cursor
        naturally moves upward (in fetch next) in alphabetical order,

There is an unnecessary white-space-only change for storage/innobase/srv/srv0conc.cc.

The following code should be reverted, and instead 10.4 should be merged to the branch:

@@ -2151,7 +2148,14 @@ static my_bool trx_get_trx_by_xid_callback(rw_trx_hash_element_t *element,
         arg->xid->eq(reinterpret_cast<XID*>(trx->xid)))
     {
       /* Invalidate the XID, so that subsequent calls will not find it. */
+#ifdef WITH_WSREP
+      if (!wsrep_is_wsrep_xid(trx->xid))
+      {
+#endif /* WITH_WSREP */
       trx->xid->null();
+#ifdef WITH_WSREP
+      }
+#endif /* WITH_WSREP */
       arg->trx= trx;
       found= 1;
     }

Comment by Marko Mäkelä [ 2019-01-15 ]

One more question: How is sr_table_name_full_str supposed to work? I am a bit afraid that it might not work as intended when there exist FOREIGN KEY constraints, especially with CASCADE or SET NULL, attached to the table.

I am afraid that sr_table_name_full_str might not work with partitioned tables. What is the logic there?

Comment by Seppo Jaakola [ 2019-01-16 ]

all fake_trx_id references should be removed from the code by now

Comment by Marko Mäkelä [ 2019-01-16 ]

I just checked the diff for storage/innobase between 10.4 and the bb-10.4-galera4 once more today, and it looks OK from my perspective. I posted a few suggestions to the Slack channel, which would be nice to have.

I think that sr_table_name_full_str should be changed from const std::string to const char* const, to avoid introducing an unnecessary constraint that Galera and MariaDB must be built with mutually compatible C++ compilers.

That said, I think that the tests need some more attention and clean-up. But that also applies to the Galera 3 tests.

Comment by Daniele Sciascia [ 2019-01-17 ]

I submitted a pull request that changes the type of sr_table_name_full_str, and renamed it wsrep_sr_table_name_full

https://github.com/MariaDB/server/pull/1102

Comment by Jan Lindström (Inactive) [ 2019-01-17 ]

Fixed on branch bb-10.4-galera4

Generated at Thu Feb 08 08:38:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.