MariaDB Server / MDEV-15792

Fix mtr to be able to wait for >1 exited mysqld

Details

    Description

      https://github.com/MariaDB/server/pull/665

          Activity

            I have no objections to the patch, but please push into a development tree first.

            elenst Elena Stepanova added a comment

            As it turns out, the patch requires amendments.

            First, it causes ERROR: wait_any failed when tests are run with testcase-timeout > 20. It is currently being fixed in https://github.com/MariaDB/server/pull/709#issuecomment-383030848.

            Another, trickier problem is a race condition (non-determinism) in the processing of actual crashes.
            It shows up like this (for example, the test case from MDEV-15878 can be used to reproduce it):

            worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
            CREATE TABLE t1 (f INT) ENGINE=InnoDB;
            INSERT INTO t1 VALUES (1),(2);
            ALTER TABLE t1 ORDER BY unknown_column;
            ERROR 42S22: Unknown column 'unknown_column' in 'order clause'
            CREATE TABLE t2 ENGINE=Aria SELECT * FROM t1;
            SELECT * FROM t2;
            worker[1] Trying to dump core for [mysqltest - pid: 2476, winpid: 2476, exit: 256]
            worker[1] Trying to dump core for [mysqld.1 - pid: 2444, winpid: 2444, exit: 256]
            worker[1] mysql-test-run: *** ERROR: Unhandled process [mysqltest - pid: 2476, winpid: 2476, exit: 256] exited
            mysql-test-run: *** ERROR: Test suite aborted
            

            A possible reason is that the SRVDIED logic lies outside the foreach $proc (keys(%keep_waiting_proc)) loop, so whenever the process remaining in $proc is not the server process, the crash doesn't get handled properly.
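
To illustrate the suspected ordering problem, here is a minimal Python model. It is not the mtr code, which is Perl. The `Proc` class, the two `classify_*` functions and the returned strings are hypothetical names made up for this sketch; only the process names and pids come from the log above. It assumes the crash check looks solely at whichever process the loop variable held last, and that the iteration order is not fixed (Perl's keys() gives no ordering guarantee).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    """Hypothetical stand-in for an exited process tracked by the test runner."""
    name: str
    pid: int
    is_server: bool


def classify_outside_loop(exited):
    """Crash check placed after the loop: only the last process seen is examined."""
    proc = None
    for proc in exited:
        pass  # per-process bookkeeping would happen here
    if proc is not None and proc.is_server:
        return "server died"
    return "unhandled process exited"


def classify_inside_loop(exited):
    """Crash check placed inside the loop: every exited process is examined."""
    for proc in exited:
        if proc.is_server:
            return "server died"
    return "unhandled process exited"


mysqltest = Proc("mysqltest", 2476, is_server=False)
mysqld = Proc("mysqld.1", 2444, is_server=True)

# Try both iteration orders, since the order is assumed to be arbitrary.
for order in ([mysqltest, mysqld], [mysqld, mysqltest]):
    names = [p.name for p in order]
    print(names, classify_outside_loop(order), "|", classify_inside_loop(order))
```

In this model the check after the loop recognises the server crash only when mysqld.1 happens to be iterated last, while the check inside the loop recognises it in either order. That matches the intermittent "Unhandled process" abort shown above, though whether the real fix takes this shape is for the pull request to decide.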

            elenst Elena Stepanova added a comment

            The Apr 25 version of pull request #709 (with commit 9f0d9012) seems to fix the problems observed locally and in buildbot; this was verified in buildbot on the bb-10.2-mtr tree. However, please note the review comments by serg: some changes have been requested.

            elenst Elena Stepanova added a comment

            People

              serg Sergei Golubchik
              jplindst Jan Lindström (Inactive)