Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20139

innodb.innodb_buffer_pool_dump_pct failed in buildbot with timeout in wait_condition.inc

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/kvm-bintar-xenial-amd64/builds/1311

      10.2 666730ee5210dab845bb02bce023e249

      innodb.innodb_buffer_pool_dump_pct 'innodb' w3 [ fail ]
              Test ended at 2019-07-03 08:56:14
       
      CURRENT_TEST: innodb.innodb_buffer_pool_dump_pct
      --- /usr/local/mariadb-10.2.26-linux-x86_64/mysql-test/suite/innodb/r/innodb_buffer_pool_dump_pct.result	2019-07-03 07:37:15.000000000 -0400
      +++ /usr/local/mariadb-10.2.26-linux-x86_64/mysql-test/suite/innodb/r/innodb_buffer_pool_dump_pct.reject	2019-07-03 08:56:14.217187024 -0400
      @@ -17,6 +17,18 @@
       FROM information_schema.global_status
       WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS';
       SET GLOBAL innodb_buffer_pool_dump_now=ON;
      +Timeout in wait_condition.inc for SELECT count(*) = 1
      +FROM information_schema.global_status
      +WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS'
      +AND variable_value != @IBPDS
      +AND variable_value like 'Buffer pool(s) dump completed at%'
      +Id	User	Host	db	Command	Time	State	Info	Progress
      +1	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
      +2	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +3	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +5	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
      +16	root	localhost	test	Query	0	init	show full processlist	0.000
       SET GLOBAL innodb_buffer_pool_dump_pct=DEFAULT;
       SET GLOBAL innodb_buffer_pool_filename=DEFAULT;
       DROP TABLE tab5;
       
      mysqltest: Result length mismatch
      

      Attachments

        Activity

          alice Alice Sherepa added a comment -

          on 10.6 101da87228f11a1d http://buildbot.askmonty.org/buildbot/builders/win32-debug/builds/19914/steps/test/logs/stdio win32

          innodb.innodb_buffer_pool_dump_pct 'innodb' w1 [ fail ]
                  Test ended at 2021-06-23 17:27:33
           
          CURRENT_TEST: innodb.innodb_buffer_pool_dump_pct
          --- D:/win32-debug/build/src/mysql-test/suite/innodb/r/innodb_buffer_pool_dump_pct.result	2021-06-23 16:55:26.220356200 +0000
          +++ D:\win32-debug\build\src\mysql-test\suite\innodb\r\innodb_buffer_pool_dump_pct.reject	2021-06-23 17:27:33.072445100 +0000
          @@ -17,6 +17,13 @@
           FROM information_schema.global_status
           WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS';
           SET GLOBAL innodb_buffer_pool_dump_now=ON;
          +Timeout in wait_condition.inc for SELECT count(*) = 1
          +FROM information_schema.global_status
          +WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS'
          +AND variable_value != @IBPDS
          +AND variable_value like 'Buffer pool(s) dump completed at%'
          +Id	User	Host	db	Command	Time	State	Info	Progress
          +7	root	localhost:52422	test	Query	0	starting	show full processlist	0.000
           SET GLOBAL innodb_buffer_pool_dump_pct=DEFAULT;
           SET GLOBAL innodb_buffer_pool_filename=DEFAULT;
           DROP TABLE tab5;
           
          mysqltest: Result length mismatch
          

          alice Alice Sherepa added a comment - on 10.6 101da87228f11a1d http://buildbot.askmonty.org/buildbot/builders/win32-debug/builds/19914/steps/test/logs/stdio win32 innodb.innodb_buffer_pool_dump_pct 'innodb' w1 [ fail ] Test ended at 2021-06-23 17:27:33   CURRENT_TEST: innodb.innodb_buffer_pool_dump_pct --- D:/win32-debug/build/src/mysql-test/suite/innodb/r/innodb_buffer_pool_dump_pct.result 2021-06-23 16:55:26.220356200 +0000 +++ D:\win32-debug\build\src\mysql-test\suite\innodb\r\innodb_buffer_pool_dump_pct.reject 2021-06-23 17:27:33.072445100 +0000 @@ -17,6 +17,13 @@ FROM information_schema.global_status WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS'; SET GLOBAL innodb_buffer_pool_dump_now=ON; +Timeout in wait_condition.inc for SELECT count(*) = 1 +FROM information_schema.global_status +WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS' +AND variable_value != @IBPDS +AND variable_value like 'Buffer pool(s) dump completed at%' +Id User Host db Command Time State Info Progress +7 root localhost:52422 test Query 0 starting show full processlist 0.000 SET GLOBAL innodb_buffer_pool_dump_pct=DEFAULT; SET GLOBAL innodb_buffer_pool_filename=DEFAULT; DROP TABLE tab5;   mysqltest: Result length mismatch

          This test still fails regularly, several times per month.

          A simple fix would be to not use a separate thread for the buffer pool resizing, but simply make the SET GLOBAL innodb_buffer_pool_size thread to resize the buffer pool and return when done. ralf.gebhardt, would that be an acceptable change of behaviour in GA releases?

          marko Marko Mäkelä added a comment - This test still fails regularly, several times per month. A simple fix would be to not use a separate thread for the buffer pool resizing, but simply make the SET GLOBAL innodb_buffer_pool_size thread to resize the buffer pool and return when done. ralf.gebhardt , would that be an acceptable change of behaviour in GA releases?
          ralf.gebhardt Ralf Gebhardt added a comment -

          If this is a way to fix the bug while it is not breaking compatibility, why not? Is there anything special to this bug fix compared to others? Are there any user visible changes which would require changes in an application?

          ralf.gebhardt Ralf Gebhardt added a comment - If this is a way to fix the bug while it is not breaking compatibility, why not? Is there anything special to this bug fix compared to others? Are there any user visible changes which would require changes in an application?

          Sorry, I meant that SET GLOBAL innodb_buffer_dump_now=ON could block until the operation has been completed. There is some similar logic around SET GLOBAL innodb_buffer_pool_size as well.

          Given that the failures are rather rare and possibly occur when the CI workers are under heavy load, I suspect that the failure probability could be reduced by the following:

          diff --git a/mysql-test/suite/innodb/t/innodb_buffer_pool_dump_pct.test b/mysql-test/suite/innodb/t/innodb_buffer_pool_dump_pct.test
          index 381091165ef..ecdeb0eda21 100644
          --- a/mysql-test/suite/innodb/t/innodb_buffer_pool_dump_pct.test
          +++ b/mysql-test/suite/innodb/t/innodb_buffer_pool_dump_pct.test
          @@ -46,9 +46,9 @@ FROM information_schema.global_status
           WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS';
           SET GLOBAL innodb_buffer_pool_dump_now=ON;
           
          -# Sleep one second in order to ensure that the time stamp is
          +# Sleep at least one second in order to ensure that the time stamp is
           # different at next dump
          ---sleep 1
          +--sleep 1.5
           
           let $wait_condition = SELECT count(*) = 1
           FROM information_schema.global_status
          @@ -70,9 +70,9 @@ WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS';
           
           SET GLOBAL innodb_buffer_pool_dump_now=ON;
           
          -# Sleep one second in order to ensure that the time stamp is
          +# Sleep at least one second in order to ensure that the time stamp is
           # different at next dump
          ---sleep 1
          +--sleep 1.5
           
           let $wait_condition = SELECT count(*) = 1
           FROM information_schema.global_status
          

          That is, under some circumstances the time stamp in the status message (measured in seconds) does not advance even though the client attempted to sleep for 1 second. The sleep statement is implemented by do_sleep(), which invokes my_sleep(). It will invoke the Microsoft Windows Sleep() or POSIX select(2), ignoring any errors. On GNU/Linux, the timeout interval will be rounded up to the system clock granularity, so presumably the sleep should not be any shorter than requested. However, EINTR is documented as a possible error. On Microsoft Windows (where this test has failed as well), it is documented that the actual sleep duration may be less than the requested delay.

          I am not convinced that increasing the delay is a good approach. I think that it is better to first wait that the file ib_buffer_pool has been created, and then wait that the status message indicates completion. The actual ‘interesting part’ of the test is to show that the size of buffer pool dumps depends on the parameter innodb_buffer_pool_dump_pct.

          marko Marko Mäkelä added a comment - Sorry, I meant that SET GLOBAL innodb_buffer_dump_now=ON could block until the operation has been completed. There is some similar logic around SET GLOBAL innodb_buffer_pool_size as well. Given that the failures are rather rare and possibly occur when the CI workers are under heavy load, I suspect that the failure probability could be reduced by the following: diff --git a/mysql-test/suite/innodb/t/innodb_buffer_pool_dump_pct.test b/mysql-test/suite/innodb/t/innodb_buffer_pool_dump_pct.test index 381091165ef..ecdeb0eda21 100644 --- a/mysql-test/suite/innodb/t/innodb_buffer_pool_dump_pct.test +++ b/mysql-test/suite/innodb/t/innodb_buffer_pool_dump_pct.test @@ -46,9 +46,9 @@ FROM information_schema.global_status WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS'; SET GLOBAL innodb_buffer_pool_dump_now=ON; -# Sleep one second in order to ensure that the time stamp is +# Sleep at least one second in order to ensure that the time stamp is # different at next dump ---sleep 1 +--sleep 1.5 let $wait_condition = SELECT count(*) = 1 FROM information_schema.global_status @@ -70,9 +70,9 @@ WHERE variable_name = 'INNODB_BUFFER_POOL_DUMP_STATUS'; SET GLOBAL innodb_buffer_pool_dump_now=ON; -# Sleep one second in order to ensure that the time stamp is +# Sleep at least one second in order to ensure that the time stamp is # different at next dump ---sleep 1 +--sleep 1.5 let $wait_condition = SELECT count(*) = 1 FROM information_schema.global_status That is, under some circumstances the time stamp in the status message (measured in seconds) does not advance even though the client attempted to sleep for 1 second. The sleep statement is implemented by do_sleep() , which invokes my_sleep() . It will invoke the Microsoft Windows Sleep() or POSIX select(2) , ignoring any errors. On GNU/Linux, the timeout interval will be rounded up to the system clock granularity, so presumably the sleep should not be any shorter than requested. However, EINTR is documented as a possible error. On Microsoft Windows (where this test has failed as well), it is documented that the actual sleep duration may be less than the requested delay. I am not convinced that increasing the delay is a good approach. I think that it is better to first wait that the file ib_buffer_pool has been created, and then wait that the status message indicates completion. The actual ‘interesting part’ of the test is to show that the size of buffer pool dumps depends on the parameter innodb_buffer_pool_dump_pct .

          The test got more than 100× faster on Microsoft Windows Debug:

          innodb.innodb_buffer_pool_dump_pct 'innodb' w26 [ pass ]    169
          

          On my Debian GNU/Linux system, the test would at best complete in 14 or 19 milliseconds, for RelWithDebInfo and Debug.

          marko Marko Mäkelä added a comment - The test got more than 100× faster on Microsoft Windows Debug: innodb.innodb_buffer_pool_dump_pct 'innodb' w26 [ pass ] 169 On my Debian GNU/Linux system, the test would at best complete in 14 or 19 milliseconds, for RelWithDebInfo and Debug.

          People

            marko Marko Mäkelä
            elenst Elena Stepanova
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.