|
It is a basic race condition + undesirable interrelation between tests. It is easy reproduce by adding a sleep somewhere here:
diff --git a/storage/xtradb/buf/buf0dump.cc b/storage/xtradb/buf/buf0dump.cc
|
index 2b3b506..60e2f36 100644
|
--- a/storage/xtradb/buf/buf0dump.cc
|
+++ b/storage/xtradb/buf/buf0dump.cc
|
@@ -193,6 +193,8 @@ buf_dump(
|
"%s%c%s", srv_data_home, SRV_PATH_SEPARATOR,
|
srv_buf_dump_filename);
|
|
+ my_sleep(1000000);
|
+
|
ut_snprintf(tmp_filename, sizeof(tmp_filename),
|
"%s.incomplete", full_filename);
|
And running two tests in a specific order:
perl ./mtr sys_vars.innodb_buffer_pool_filename_basic,xtradb sys_vars.innodb_buffer_pool_load_now_basic,xtradb --noreorder
|
What happens here is that the first test, innodb_buffer_pool_filename_basic, executes buffer pool dump into a different file name and sets the status variable innodb_buffer_pool_dump_status 'Buffer pool(s) dump completed at'. It then properly restores the file name to the default value.
Then the second test (the one in question) starts executing. It triggers buffer pool dump in the background thread and starts watching the status variable innodb_buffer_pool_dump_status waiting for it to say that the dump was completed. If it checks the value for the first time after the dump started, it's fine, because the value will be set to something about "in progress", and will be changed to "completed" only when the file is created. But if it checks the value before the dump started, which we cause by adding the sleep, the previous status variable value is still "dump completed", so the wait ends immediately. However, the file isn't there yet (because the previous test dumped into a different file name, remember?). Hence the error.
|
|
To jplindst:
here is one way to fix it. Please check if you agree, or you can think of something better.
For a note, the added wait condition is just to avoid a sleep. It saves ~1 sec execution time, but if you think it's not worth the complexity, please feel free to replace it with sleep 1.
The additional clause in the second wait condition is necessary anyway.
diff --git a/mysql-test/suite/sys_vars/t/innodb_buffer_pool_load_now_basic.test b/mysql-test/suite/sys_vars/t/innodb_buffer_pool_load_now_basic.test
|
index fe376a3..5cf8156 100644
|
--- a/mysql-test/suite/sys_vars/t/innodb_buffer_pool_load_now_basic.test
|
+++ b/mysql-test/suite/sys_vars/t/innodb_buffer_pool_load_now_basic.test
|
@@ -8,12 +8,20 @@
|
SET @orig = @@global.innodb_buffer_pool_load_now;
|
SELECT @orig;
|
|
+let $old_status= `SELECT variable_value FROM information_schema.global_status
|
+ WHERE LOWER(variable_name) = 'innodb_buffer_pool_dump_status'`;
|
+
|
+let $wait_condition =
|
+ SELECT TRIM(SUBSTR('$old_status', -8)) != DATE_FORMAT(CURTIME(), '%l:%i:%s');
|
+-- source include/wait_condition.inc
|
+
|
# Do the dump
|
SET GLOBAL innodb_buffer_pool_dump_now = ON;
|
|
# Wait for the dump to complete
|
let $wait_condition =
|
- SELECT SUBSTR(variable_value, 1, 33) = 'Buffer pool(s) dump completed at '
|
+ SELECT variable_value != '$old_status'
|
+ AND SUBSTR(variable_value, 1, 33) = 'Buffer pool(s) dump completed at '
|
FROM information_schema.global_status
|
WHERE LOWER(variable_name) = 'innodb_buffer_pool_dump_status';
|
-- source include/wait_condition.inc
|
|
|
Got a new failure with the change – also sporadic, happened only once:
http://buildbot.askmonty.org/buildbot/builders/kvm-deb-precise-amd64/builds/5853/steps/test_4/logs/stdio
sys_vars.innodb_buffer_pool_load_now_basic 'innodb_plugin' w3 [ fail ]
|
Test ended at 2016-03-12 20:13:43
|
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_load_now_basic
|
--- /usr/share/mysql/mysql-test/suite/sys_vars/r/innodb_buffer_pool_load_now_basic.result 2016-03-12 18:03:51.000000000 +0200
|
+++ /run/shm/var/3/log/innodb_buffer_pool_load_now_basic.reject 2016-03-12 20:13:43.294185838 +0200
|
@@ -3,6 +3,12 @@
|
@orig
|
0
|
SET GLOBAL innodb_buffer_pool_dump_now = ON;
|
+Timeout in wait_condition.inc for SELECT variable_value != 'Buffer pool(s) dump completed at 160312 20:13:12'
|
+ AND SUBSTR(variable_value, 1, 33) = 'Buffer pool(s) dump completed at '
|
+ FROM information_schema.global_status
|
+WHERE LOWER(variable_name) = 'innodb_buffer_pool_dump_status'
|
+Id User Host db Command Time State Info Progress
|
+30 root localhost test Query 0 init show full processlist 0.000
|
SET GLOBAL innodb_buffer_pool_load_now = ON;
|
SELECT variable_value
|
FROM information_schema.global_status
|
From the error log, it's pretty clear why the wait condition timed out:
CURRENT_TEST: sys_vars.innodb_buffer_pool_filename_basic
|
2016-03-12 20:13:12 7facde3fd700 InnoDB: Dumping buffer pool(s) to .//innodb_foobar_dump
|
2016-03-12 20:13:12 7facde3fd700 InnoDB: Buffer pool(s) dump completed at 160312 20:13:12
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_instances_basic
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_load_abort_basic
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_load_at_startup_basic
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_load_now_basic
|
2016-03-12 20:13:12 7facde3fd700 InnoDB: Dumping buffer pool(s) to .//ib_buffer_pool
|
2016-03-12 20:13:12 7facde3fd700 InnoDB: Buffer pool(s) dump completed at 160312 20:13:12
|
2016-03-12 20:13:43 7facde3fd700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
|
2016-03-12 20:13:43 7facde3fd700 InnoDB: Buffer pool(s) load completed at 160312 20:13:43
|
So, the clever wait for the next second did not work, the test performed the dump the same second as the previous one happened, hence the variable value never changed, and wait condition did not succeed.
The problem is, I have no idea why the wait for the next second did not work, it's pretty straightforward. Unfortunately, there is no general log to see the details. I thought maybe there is some timezone specifics on the machine, or something of the sort, but I ran the same VM, and there seems to be nothing special about it, everything works as expected. I'll keep thinking.
|
|
It turned out to be easy enough to reproduce by running the test with repeat:
....
|
sys_vars.innodb_buffer_pool_load_now_basic 'innodb_plugin' [ pass ] 713
|
sys_vars.innodb_buffer_pool_load_now_basic 'innodb_plugin' [ pass ] 814
|
sys_vars.innodb_buffer_pool_load_now_basic 'innodb_plugin' [ pass ] 813
|
sys_vars.innodb_buffer_pool_load_now_basic 'innodb_plugin' [ fail ]
|
Test ended at 2016-03-12 20:10:07
|
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_load_now_basic
|
--- /bld/10.0-release/mysql-test/suite/sys_vars/r/innodb_buffer_pool_load_now_basic.result 2016-01-17 10:16:40.000000000 -0500
|
+++ /bld/10.0-release/mysql-test/suite/sys_vars/r/innodb_buffer_pool_load_now_basic.reject 2016-03-12 20:10:07.434489216 -0500
|
@@ -3,6 +3,12 @@
|
@orig
|
0
|
SET GLOBAL innodb_buffer_pool_dump_now = ON;
|
+Timeout in wait_condition.inc for SELECT variable_value != 'Buffer pool(s) dump completed at 160312 20:09:36'
|
+ AND SUBSTR(variable_value, 1, 33) = 'Buffer pool(s) dump completed at '
|
+ FROM information_schema.global_status
|
+WHERE LOWER(variable_name) = 'innodb_buffer_pool_dump_status'
|
+Id User Host db Command Time State Info Progress
|
+79 root localhost test Query 0 init show full processlist 0.000
|
SET GLOBAL innodb_buffer_pool_load_now = ON;
|
SELECT variable_value
|
FROM information_schema.global_status
|
Here is what we have in the error log:
...
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_load_now_basic
|
2016-03-12 20:09:35 7fd8705fb700 InnoDB: Dumping buffer pool(s) to .//ib_buffer_pool
|
2016-03-12 20:09:35 7fd8705fb700 InnoDB: Buffer pool(s) dump completed at 160312 20:09:35
|
2016-03-12 20:09:35 7fd8705fb700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
|
2016-03-12 20:09:35 7fd8705fb700 InnoDB: Buffer pool(s) load completed at 160312 20:09:35
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_load_now_basic
|
2016-03-12 20:09:36 7fd8705fb700 InnoDB: Dumping buffer pool(s) to .//ib_buffer_pool
|
2016-03-12 20:09:36 7fd8705fb700 InnoDB: Buffer pool(s) dump completed at 160312 20:09:36
|
2016-03-12 20:09:36 7fd8705fb700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
|
2016-03-12 20:09:36 7fd8705fb700 InnoDB: Buffer pool(s) load completed at 160312 20:09:36
|
CURRENT_TEST: sys_vars.innodb_buffer_pool_load_now_basic
|
2016-03-12 20:09:36 7fd8705fb700 InnoDB: Dumping buffer pool(s) to .//ib_buffer_pool
|
2016-03-12 20:09:36 7fd8705fb700 InnoDB: Buffer pool(s) dump completed at 160312 20:09:36
|
2016-03-12 20:10:07 7fd8705fb700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
|
2016-03-12 20:10:07 7fd8705fb700 InnoDB: Buffer pool(s) load completed at 160312 20:10:07
|
That is, the last test says the dump was performed on the same second as the previous one.
But here is what we have in the general log:
79 Query SELECT TRIM(SUBSTR('Buffer pool(s) dump completed at 160312 20:09:36', -8)) != DATE_FORMAT(CURTIME(), '%k:%i:%s')
|
79 Query SELECT TRIM(SUBSTR('Buffer pool(s) dump completed at 160312 20:09:36', -8)) != DATE_FORMAT(CURTIME(), '%k:%i:%s')
|
79 Query SELECT TRIM(SUBSTR('Buffer pool(s) dump completed at 160312 20:09:36', -8)) != DATE_FORMAT(CURTIME(), '%k:%i:%s')
|
79 Query SELECT TRIM(SUBSTR('Buffer pool(s) dump completed at 160312 20:09:36', -8)) != DATE_FORMAT(CURTIME(), '%k:%i:%s')
|
79 Query SELECT TRIM(SUBSTR('Buffer pool(s) dump completed at 160312 20:09:36', -8)) != DATE_FORMAT(CURTIME(), '%k:%i:%s')
|
79 Query SELECT TRIM(SUBSTR('Buffer pool(s) dump completed at 160312 20:09:36', -8)) != DATE_FORMAT(CURTIME(), '%k:%i:%s')
|
79 Query SELECT TRIM(SUBSTR('Buffer pool(s) dump completed at 160312 20:09:36', -8)) != DATE_FORMAT(CURTIME(), '%k:%i:%s')
|
160312 20:09:37 79 Query SELECT TRIM(SUBSTR('Buffer pool(s) dump completed at 160312 20:09:36', -8)) != DATE_FORMAT(CURTIME(), '%k:%i:%s')
|
79 Query SET GLOBAL innodb_buffer_pool_dump_now = ON
|
79 Query SELECT variable_value != 'Buffer pool(s) dump completed at 160312 20:09:36'
|
AND SUBSTR(variable_value, 1, 33) = 'Buffer pool(s) dump completed at '
|
FROM information_schema.global_status
|
So, the waiting on the current time worked perfectly – the test did wait till CURTIME switched to the next second, only only then, at 20:09:37, the dump was performed.
However, InnoDB still wrote its records to the error log with the older timestamp, and used the same for the variable value. Further failure is caused by this inconsistency.
Thinking about it, it's no news that InnoDB has its own understanding of time, but I didn't know it would affect seconds. I don't know if it's a bug, maybe it is, but it's so tiny that I doubt anybody would care fixing it. Just for accuracy, lets ask jplindst: see the explanation above – do you want to look into why InnoDB would use different time comparing to the server?
If not, I'll think of some other way to fix the test. sleep 1 also does not help here, it can hit the same problem; maybe longer sleeps could do the trick, but then again, if InnoDB can be 1 second late, maybe it can be 10 seconds late?
On a separate note, while running the test repeatedly I found out that the final part of the test, the LOAD part, also does not work properly, on a similar reason to what we had here initially: if the status variable was previously set, the last SELECT from I_S can happen too early and show an unexpected value. I'll fix it along with re-fixing the current issue.
|
|
jplindst,
I don't think it can be so. First, statements are written to the general log before they are executed, otherwise we would never have crashing queries in general logs, and we always do.
But more importantly, even the previous statement, the last SELECT TRIM, is also written with the newer timestamp. Now, it cannot possibly be explained by InnoDB writing its log before the SET statement was written to the general log. By this time, InnoDB knew nothing about future request for the dump.
|