Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20928

Galera test failure on galera.galera_var_innodb_disallow_writes: Result length mismatch

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.3.21, 10.4.9
    • 10.2.33, 10.3.24, 10.4.14, 10.5.5
    • Galera, Tests
    • None
    • OS: CentOS Linux release 7.6.1810 (Core). TestTarball_2 rhel-8.

    Description

      It sporadically fails on bb and local suite runs.

      MariaDB Version 10.4.9-MariaDB-debug: Repository: MariaDB/server; branch 10.4; Revision 9afbb1069ab4178741b76adefacd3f6a0ed51369; debug built from sources.

      Galera lib 26.4.3(r4535): Repository: MariaDB/galera; branch mariadb-4.x; Revision 752664dc3c7065d8e0c73ac99d0028a5f84eb250; debug built from sources.

      Run:

      ./mtr --suite galera --testcase-timeout=90 --force --big-test --max-test-fail=0 --timer --timestamp --timediff
      

      stdout.log (the whole file is attached: 191029_HDD_stdout.log):

      mysqltest: Result length mismatch

      191029 20:10:41 +55.136 galera.galera_var_innodb_disallow_writes 'innodb' [ fail ]
              Test ended at 2019-10-29 20:10:41
       
      CURRENT_TEST: galera.galera_var_innodb_disallow_writes
      --- /home/stepan/mariadb/10.4/github/mysql-test/suite/galera/r/galera_var_innodb_disallow_writes.result	2019-10-17 14:30:27.382278020 +0300
      +++ /home/stepan/mariadb/10.4/github/mysql-test/suite/galera/r/galera_var_innodb_disallow_writes.reject	2019-10-29 20:10:41.229513794 +0200
      @@ -10,6 +10,18 @@
       SELECT COUNT(*) = 1 FROM t1;
       COUNT(*) = 1
       0
      +Timeout in wait_condition.inc for SELECT 1 FROM INFORMATION_SCHEMA.PROCESSLIST WHERE INFO = 'INSERT INTO t1 VALUES (1)' AND State = 'Commit'
      +Id	User	Host	db	Command	Time	State	Info	Progress
      +1	system user		NULL	Sleep	43	wsrep aborter idle	NULL	0.000
      +2	system user		NULL	Sleep	43	Closing tables	NULL	0.000
      +3	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
      +4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +5	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
      +7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
      +18	root	localhost	test	Sleep	35		NULL	0.000
      +19	root	localhost:48696	test	Query	31	Update	INSERT INTO t1 VALUES (1)	0.000
      +20	root	localhost:48698	test	Query	0	Init	show full processlist	0.000
       SELECT COUNT(*) = 1 FROM t1;
       COUNT(*) = 1
       0
       
      mysqltest: Result length mismatch
      

      See also other logs attached: 191030_galera_var_innodb_disallow_writes.zip.

      Attachments

        Activity

          It also sporadically fails on Azure.

          stdout.log:

          10.3.21-5 76cd6d4б rhel-8

          2020-01-12T03:36:58.8333020Z galera.galera_var_innodb_disallow_writes 'innodb' w1 [ fail ]
          2020-01-12T03:36:58.8334667Z         Test ended at 2020-01-12 03:36:58
          2020-01-12T03:36:58.8335002Z 
          2020-01-12T03:36:58.8335902Z CURRENT_TEST: galera.galera_var_innodb_disallow_writes
          2020-01-12T03:36:58.8338173Z --- /__w/1/s/mysql-test/suite/galera/r/galera_var_innodb_disallow_writes.result	2020-01-12 01:04:46.000000000 +0000
          2020-01-12T03:36:58.8338910Z +++ /__w/1/s/mysql-test/suite/galera/r/galera_var_innodb_disallow_writes.reject	2020-01-12 03:36:58.461969323 +0000
          2020-01-12T03:36:58.8339402Z @@ -8,6 +8,18 @@
          2020-01-12T03:36:58.8339644Z  SELECT COUNT(*) = 1 FROM t1;
          2020-01-12T03:36:58.8339858Z  COUNT(*) = 1
          2020-01-12T03:36:58.8340042Z  0
          2020-01-12T03:36:58.8340597Z +Timeout in wait_condition.inc for SELECT 1 FROM INFORMATION_SCHEMA.PROCESSLIST WHERE INFO = 'INSERT INTO t1 VALUES (1)' AND State = 'Commit'
          2020-01-12T03:36:58.8340916Z +Id	User	Host	db	Command	Time	State	Info	Progress
          2020-01-12T03:36:58.8341161Z +1	system user		NULL	Sleep	108	WSREP aborter idle	NULL	0.000
          2020-01-12T03:36:58.8341392Z +2	system user		NULL	Sleep	33	Committed 87	NULL	0.000
          2020-01-12T03:36:58.8341604Z +3	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
          2020-01-12T03:36:58.8341841Z +4	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
          2020-01-12T03:36:58.8342073Z +5	system user		NULL	Daemon	NULL	InnoDB purge coordinator	NULL	0.000
          2020-01-12T03:36:58.8342314Z +6	system user		NULL	Daemon	NULL	InnoDB purge worker	NULL	0.000
          2020-01-12T03:36:58.8342549Z +7	system user		NULL	Daemon	NULL	InnoDB shutdown handler	NULL	0.000
          2020-01-12T03:36:58.8342763Z +101	root	localhost	test	Sleep	33		NULL	0.000
          2020-01-12T03:36:58.8342996Z +102	root	localhost:34018	test	Query	30	Update	INSERT INTO t1 VALUES (1)	0.000
          2020-01-12T03:36:58.8343239Z +103	root	localhost:34020	test	Query	0	Init	show full processlist	0.000
          2020-01-12T03:36:58.8345241Z  SELECT COUNT(*) = 1 FROM t1;
          2020-01-12T03:36:58.8345511Z  COUNT(*) = 1
          2020-01-12T03:36:58.8345699Z  0
          2020-01-12T03:36:58.8345852Z 
          2020-01-12T03:36:58.8346020Z mysqltest: Result length mismatch
          2020-01-12T03:36:58.8346188Z 
          2020-01-12T03:36:58.8399015Z  - saving '/var/tmp/mtr/1/log/galera.galera_var_innodb_disallow_writes-innodb/' to '/var/tmp/mtr/log/galera.galera_var_innodb_disallow_writes-innodb/'
          

          stepan.patryshev Stepan Patryshev (Inactive) added a comment - - edited It also sporadically fails on Azure. stdout.log : 10.3.21-5 76cd6d4б rhel-8 2020-01-12T03:36:58.8333020Z galera.galera_var_innodb_disallow_writes 'innodb' w1 [ fail ] 2020-01-12T03:36:58.8334667Z Test ended at 2020-01-12 03:36:58 2020-01-12T03:36:58.8335002Z 2020-01-12T03:36:58.8335902Z CURRENT_TEST: galera.galera_var_innodb_disallow_writes 2020-01-12T03:36:58.8338173Z --- /__w/1/s/mysql-test/suite/galera/r/galera_var_innodb_disallow_writes.result 2020-01-12 01:04:46.000000000 +0000 2020-01-12T03:36:58.8338910Z +++ /__w/1/s/mysql-test/suite/galera/r/galera_var_innodb_disallow_writes.reject 2020-01-12 03:36:58.461969323 +0000 2020-01-12T03:36:58.8339402Z @@ -8,6 +8,18 @@ 2020-01-12T03:36:58.8339644Z SELECT COUNT(*) = 1 FROM t1; 2020-01-12T03:36:58.8339858Z COUNT(*) = 1 2020-01-12T03:36:58.8340042Z 0 2020-01-12T03:36:58.8340597Z +Timeout in wait_condition.inc for SELECT 1 FROM INFORMATION_SCHEMA.PROCESSLIST WHERE INFO = 'INSERT INTO t1 VALUES (1)' AND State = 'Commit' 2020-01-12T03:36:58.8340916Z +Id User Host db Command Time State Info Progress 2020-01-12T03:36:58.8341161Z +1 system user NULL Sleep 108 WSREP aborter idle NULL 0.000 2020-01-12T03:36:58.8341392Z +2 system user NULL Sleep 33 Committed 87 NULL 0.000 2020-01-12T03:36:58.8341604Z +3 system user NULL Daemon NULL InnoDB purge worker NULL 0.000 2020-01-12T03:36:58.8341841Z +4 system user NULL Daemon NULL InnoDB purge worker NULL 0.000 2020-01-12T03:36:58.8342073Z +5 system user NULL Daemon NULL InnoDB purge coordinator NULL 0.000 2020-01-12T03:36:58.8342314Z +6 system user NULL Daemon NULL InnoDB purge worker NULL 0.000 2020-01-12T03:36:58.8342549Z +7 system user NULL Daemon NULL InnoDB shutdown handler NULL 0.000 2020-01-12T03:36:58.8342763Z +101 root localhost test Sleep 33 NULL 0.000 2020-01-12T03:36:58.8342996Z +102 root localhost:34018 test Query 30 Update INSERT INTO t1 VALUES (1) 0.000 2020-01-12T03:36:58.8343239Z +103 root localhost:34020 test Query 0 Init show full processlist 0.000 2020-01-12T03:36:58.8345241Z SELECT COUNT(*) = 1 FROM t1; 2020-01-12T03:36:58.8345511Z COUNT(*) = 1 2020-01-12T03:36:58.8345699Z 0 2020-01-12T03:36:58.8345852Z 2020-01-12T03:36:58.8346020Z mysqltest: Result length mismatch 2020-01-12T03:36:58.8346188Z 2020-01-12T03:36:58.8399015Z - saving '/var/tmp/mtr/1/log/galera.galera_var_innodb_disallow_writes-innodb/' to '/var/tmp/mtr/log/galera.galera_var_innodb_disallow_writes-innodb/'

          Increased the priority since it failed on Azure.

          stepan.patryshev Stepan Patryshev (Inactive) added a comment - Increased the priority since it failed on Azure.
          seppo Seppo Jaakola added a comment -

          This test can end up in mutex deadlock between innodb background threads and two test connection thread executions:

          2020-06-18 15:37:20 0 [Warning] InnoDB: A long semaphore wait:
          --Thread 139712668890880 has waited at row0row.cc line 1212 for 241.00 seconds the semaphore:
          X-lock on RW-latch at 0x7f1173416308 created in file buf0buf.cc line 1580
          a writer (thread id 0) has reserved it in mode SX
          number of readers 0, waiters flag 1, lock_word: 10000000
          Last time write locked in file buf0flu.cc line 1165
          2020-06-18 15:37:20 0 [Warning] InnoDB: A long semaphore wait:
          --Thread 139712923383552 has waited at buf0lru.cc line 936 for 241.00 seconds the semaphore:
          Mutex at 0x5585f8439e30, Mutex BUF_POOL created buf0buf.cc:1873, lock var 2
          Last time reserved in file /home/mikko/work/codership_mariadb/mariadb-server/storage/innobase/buf/buf0buf.cc line 6214
          2020-06-18 15:37:20 0 [Warning] InnoDB: A long semaphore wait:
          --Thread 139713340323584 has waited at btr0pcur.ic line 545 for 241.00 seconds the semaphore:
          S-lock on RW-latch at 0x7f117342c4f0 created in file buf0buf.cc line 1580
          a writer (thread id 139712923383552) has reserved it in mode exclusive
          number of readers 0, waiters flag 1, lock_word: 0
          Last time write locked in file row0ins.cc line 2607
          2020-06-18 15:37:20 0 [Note] InnoDB: A semaphore wait:
          --Thread 139712803108608 has waited at buf0flu.cc line 2226 for 240.00 seconds the semaphore:
          Mutex at 0x5585f8439e30, Mutex BUF_POOL created buf0buf.cc:1873, lock var 2
          Last time reserved in file /home/mikko/work/codership_mariadb/mariadb-server/storage/innobase/buf/buf0buf.cc line 6214

          1. test connection processing INSERT, hangs
          Thread 29 (Thread 0x7f11732b3700 (LWP 99285)):
          buf_pool_mutex_enter(buf_pool);

          2. test connection processing SELECT, hangs:
          Thread 30 (Thread 0x7f118c053700 (LWP 99043)):
          rw_lock_s_lock_inline(&block->lock, 0, file, line);

          3. innodb background thread hangs:
          #25 0x00005585f60a6841 in srv_purge_coordinator_thread (arg=0x0) at /home/mikko/work/codership_mariadb/mariadb-server/storage/innobase/srv/srv0srv.cc:2740
          Thread 23 (Thread 0x7f1163fff700 (LWP 99030)):
          rw_lock_x_lock_inline(&block->lock, 0, file, line);

          4. another innodb background threads hangs for the actual innodb_disallow_writes block:
          #15 0x00005585f60a59e6 in srv_master_thread (arg=0x0) at /home/mikko/work/codership_mariadb/mariadb-server/storage/innobase/srv/srv0srv.cc:2384
          Thread 18 (Thread 0x7f1168df4700 (LWP 99029)):
          WAIT_ALLOW_WRITES();

          seppo Seppo Jaakola added a comment - This test can end up in mutex deadlock between innodb background threads and two test connection thread executions: 2020-06-18 15:37:20 0 [Warning] InnoDB: A long semaphore wait: --Thread 139712668890880 has waited at row0row.cc line 1212 for 241.00 seconds the semaphore: X-lock on RW-latch at 0x7f1173416308 created in file buf0buf.cc line 1580 a writer (thread id 0) has reserved it in mode SX number of readers 0, waiters flag 1, lock_word: 10000000 Last time write locked in file buf0flu.cc line 1165 2020-06-18 15:37:20 0 [Warning] InnoDB: A long semaphore wait: --Thread 139712923383552 has waited at buf0lru.cc line 936 for 241.00 seconds the semaphore: Mutex at 0x5585f8439e30, Mutex BUF_POOL created buf0buf.cc:1873, lock var 2 Last time reserved in file /home/mikko/work/codership_mariadb/mariadb-server/storage/innobase/buf/buf0buf.cc line 6214 2020-06-18 15:37:20 0 [Warning] InnoDB: A long semaphore wait: --Thread 139713340323584 has waited at btr0pcur.ic line 545 for 241.00 seconds the semaphore: S-lock on RW-latch at 0x7f117342c4f0 created in file buf0buf.cc line 1580 a writer (thread id 139712923383552) has reserved it in mode exclusive number of readers 0, waiters flag 1, lock_word: 0 Last time write locked in file row0ins.cc line 2607 2020-06-18 15:37:20 0 [Note] InnoDB: A semaphore wait: --Thread 139712803108608 has waited at buf0flu.cc line 2226 for 240.00 seconds the semaphore: Mutex at 0x5585f8439e30, Mutex BUF_POOL created buf0buf.cc:1873, lock var 2 Last time reserved in file /home/mikko/work/codership_mariadb/mariadb-server/storage/innobase/buf/buf0buf.cc line 6214 1. test connection processing INSERT, hangs Thread 29 (Thread 0x7f11732b3700 (LWP 99285)): buf_pool_mutex_enter(buf_pool); 2. test connection processing SELECT, hangs: Thread 30 (Thread 0x7f118c053700 (LWP 99043)): rw_lock_s_lock_inline(&block->lock, 0, file, line); 3. innodb background thread hangs: #25 0x00005585f60a6841 in srv_purge_coordinator_thread (arg=0x0) at /home/mikko/work/codership_mariadb/mariadb-server/storage/innobase/srv/srv0srv.cc:2740 Thread 23 (Thread 0x7f1163fff700 (LWP 99030)): rw_lock_x_lock_inline(&block->lock, 0, file, line); 4. another innodb background threads hangs for the actual innodb_disallow_writes block: #15 0x00005585f60a59e6 in srv_master_thread (arg=0x0) at /home/mikko/work/codership_mariadb/mariadb-server/storage/innobase/srv/srv0srv.cc:2384 Thread 18 (Thread 0x7f1168df4700 (LWP 99029)): WAIT_ALLOW_WRITES();
          seppo Seppo Jaakola added a comment -

          a PR proposal to fix this will be submitted soon'ish, it contains test logic refactoring

          seppo Seppo Jaakola added a comment - a PR proposal to fix this will be submitted soon'ish, it contains test logic refactoring
          seppo Seppo Jaakola added a comment -

          a PR was submitted

          seppo Seppo Jaakola added a comment - a PR was submitted
          seppo Seppo Jaakola added a comment -

          note: the test was not enabled as part of the PR, it probably should be enabled if PR passes reviews

          seppo Seppo Jaakola added a comment - note: the test was not enabled as part of the PR, it probably should be enabled if PR passes reviews

          People

            jplindst Jan Lindström (Inactive)
            stepan.patryshev Stepan Patryshev (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.