[MDEV-17705] Review InnoDB changes on Galera 4 wsrep patch Created: 2018-11-14 Updated: 2019-03-20 Resolved: 2019-01-17 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Galera, Storage Engine - InnoDB |
| Fix Version/s: | 10.4.2 |
| Type: | Task | Priority: | Critical |
| Reporter: | Jan Lindström (Inactive) | Assignee: | Jan Lindström (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | galera_4 | ||
| Issue Links: |
|
||||||||
| Description |
|
Related to: https://github.com/codership/mariaDB-wsrep/issues/49 |
| Comments |
| Comment by Marko Mäkelä [ 2018-11-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Currently, the latest code seems to be bb-10.4-galera4.
I have requested that the functions get_wsrep_xid_uuid() and get_wsrep_xid_seqno() be renamed back, to lose the get_ prefix. That would reduce the unnecessary diff. I also see that some occurrences of the function wsrep_must_process_fk() have been modified. Why is that? Can that function not be modified instead? And why would we pay an extra penalty for the "Galera is disabled" case by putting the expensive check upfront? Here are a couple examples that the above command is producing for me:
Why was this occurrence of wsrep_must_process_fk() replaced, but others were preserved?
Why do we evaluate wsrep_on() before evaluating simpler conditions?
Also, currently the called function is defined in an inefficient way:
I think that the call to wsrep_on_trx(trx) should be last, that is, something like this:
(This change should be applied to 10.1 already.) These two forms should be equivalent, but the call wsrep_on((trx)->mysql_thd)) a.k.a. wsrep_service->wsrep_on_func(trx->mysql_thd) is moved to the end. jplindst, please notify me when the above issues have been addressed. I would like to review the remaining InnoDB changes. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2018-12-19 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Above changes and additional unnecessary changes are addressed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2018-12-19 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
in branch bb-10.4-galera4 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-12-28 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I can still see changes that do not make sense to me. Here is an example:
Why was the preceding conditional assignment not removed? Also, it seems to me that the latest merge from 10.4 was more than 1 month ago (November 20). Please merge more often, and read and minimize the diff yourself, before requesting a new review. Why are the following files added or changed?
The added file mysql-test/include/have_ipv6.inc appears to (almost) duplicate the existing file mysql-test/suite/galera_3nodes/include/have_ipv6.inc. I do not think that any subsystem-specific files should be added to mysql-test/include. They belong in the test suites of the subsystems. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2018-12-28 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
We have more recent 10.4 merge on regression testing as MDL-locks were refactored so that current Galera implementation does not work anymore. Some of those includes are used on several suites thus they are placed main one. I will go through them and see if some of them can be moved. Those files are changed as changes are required, I will double check them but some of those changes need to be there. Anyway, in this MDEV I will try to concentrate InnoDB only changes. There is also MDEV for server changes. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-12-28 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
For example, suite/innodb/include/wait_all_purged.inc can be included from InnoDB-specific tests in other test suites just fine. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2019-01-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
commit f5769e497dd1d1b7355b92a0c51970a94726025c Remove unnecessary changes and fix coding style. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-01-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you for the merge.
Here are some highlights:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2019-01-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
marko Above commit should address these comments. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-01-15 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I think that a merge from 10.4 would be helpful. Looking at the diff again, I see that quite a few tests are being disabled, often without any comment. A whole new file mysql-test/suite/galera/t/disabled.def is being added, duplicating the file mysql-test/suite/galera/disabled.def. The file mysql-test/suite/galera/r/galera_ist_progress.result contains conflict markers. Many .result files in mysql-test/suite/galera/r add two lines for two connections, including ysql-test/suite/galera/r/galera_sst_rsync2,debug.rdiff. Could this noise be avoided? The file mysql-test/suite/galera/r/galera_sst_rsync,debug.rdiff is being changed without any good reason. The test mysql-test/suite/galera/r/galera_parallel_autoinc_largetrx.result is showing a different result for SELECT COUNT(*), that is, 6 instead of 3 rows. Could we please replace COUNT(*) with * when the resultsets are so small? (Use replace_column to mask any nondeterministic columns in the output.) Also, I still see things like SELECT COUNT(*) = 4, which are returning no useful information when the test is failing. In galera.galera_var_slave_threads I see that SELECT COUNT(*) has been degraded to SELECT COUNT(*) = some_constant. Some files add lines with trailing white space. Some files add lines where TAB is followed by space when it should not be. Some tests refer to the deprecated variable debug instead of debug_dbug. In mysql-test/suite/galera/t/MW-86-wait1.test I suspect that there is a race condition:
If the signal was not consumed before the second statement, then it could be lost, causing the test to fail. Many tests add source include/have_innodb.inc, even though include/galera_cluster.inc already does that. For example, I do not think that we need any change to mysql-test/suite/galera/t/galera_sst_mysqldump.test. In mysql-test/suite/galera/t/galera_toi_ddl_locking.test there does not seem to be faith in DEBUG_SYNC, because there is an extra 10-second sleep. Why?
I do not think that any new Galera-specific files belong in the top-level mysql-test/include directory. The changes to mysql-test/suite/galera/t/mysql-wsrep#332.test and mysql-test/suite/galera_3nodes/t/galera_ist_gcache_rollover.test should be reverted. Why does mysql-test/suite/galera/t/rpl_row_annotate.test comment out source include/galera_end.inc at the end? Why are many lines commented out from mysql-test/suite/galera_3nodes/t/galera_ipv6_xtrabackup-v2.test? I do not quite understand this:
If this setting is supposed to be deprecated (which is what I guess is meant), then why it is still enabled by default? Shouldn't the deprecation warning be issued when the variable is being set? The file scripts/mysqld_safe.sh is missing quotes around "$wr_logfile_permanent". Why is the file scripts/wsrep_sst_mysqldump.sh being modified? Or scripts/wsrep_sst_rsync.sh? The latter look like bug fixes that should be done in 10.1 already! If wsrep_fake_trx_id() is being removed, why are ha_fake_trx_id() and handlerton::fake_trx_id not removed? It looks like the changes to innobase_kill_query() should be applied to 10.2 already. This could fix hangs in the Galera transaction abort in 10.2 and 10.3. In wsrep_append_foreign_key(), the format string is wrongly using "%lu" instead of ULINTPF. The index->name will always hold. This whole hunk should be reverted:
Do we really have to duplicate some #include in storage/innobase/handler/ha_innodb.h? One #include <mysql/service_wsrep.h> should suffice. The function wsrep_kill_victim() should remove the local variable bf_this. After the first check, it will always hold. This change should be reverted:
It looks like a Here, do we need parentheses around the last ?: expression? If not, I would suggest adding parentheses around the expression that precedes ?, for clarity.
Here, we should add else before the added if in order to avoid unnecessary performance penalty:
There is an unnecessary white-space-only change for storage/innobase/srv/srv0conc.cc. The following code should be reverted, and instead 10.4 should be merged to the branch:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-01-15 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
One more question: How is sr_table_name_full_str supposed to work? I am a bit afraid that it might not work as intended when there exist FOREIGN KEY constraints, especially with CASCADE or SET NULL, attached to the table. I am afraid that sr_table_name_full_str might not work with partitioned tables. What is the logic there? | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Seppo Jaakola [ 2019-01-16 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
all fake_trx_id references should be removed from the code by now | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2019-01-16 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I just checked the diff for storage/innobase between 10.4 and the bb-10.4-galera4 once more today, and it looks OK from my perspective. I posted a few suggestions to the Slack channel, which would be nice to have. I think that sr_table_name_full_str should be changed from const std::string to const char* const, to avoid introducing an unnecessary constraint that Galera and MariaDB must be built with mutually compatible C++ compilers. That said, I think that the tests need some more attention and clean-up. But that also applies to the Galera 3 tests. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Daniele Sciascia [ 2019-01-17 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I submitted a pull request that changes the type of sr_table_name_full_str, and renamed it wsrep_sr_table_name_full | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Jan Lindström (Inactive) [ 2019-01-17 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Fixed on branch bb-10.4-galera4 |