MariaDB Server / MDEV-23777

mtr --rr leaves a broken trace in some cases

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      We have a working patch that fixes the problem by sending SIGINT instead of SIGKILL to the child's process group:

       Index: mysql-test/lib/My/SafeProcess/safe_process.cc
       IDEA additional info:
       Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
       <+>UTF-8
       ===================================================================
       --- mysql-test/lib/My/SafeProcess/safe_process.cc    (revision a7e7508a75f6bd87ac8e8d1f32930e1f3799d226)
       +++ mysql-test/lib/My/SafeProcess/safe_process.cc    (date 1596518504000)
       @@ -97,7 +97,7 @@
          message("Killing child: %d", child_pid);
          // Terminate whole process group
          if (! was_killed)
       -    kill(-child_pid, SIGKILL);
       +    kill(-child_pid, SIGINT);
        
          pid_t ret_pid= waitpid(child_pid, &status, 0);
          if (ret_pid == child_pid)
      

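      However, it needs a hang defence before it can go into trunk: if the child ignores SIGINT or hangs while handling it, the blocking waitpid() that follows never returns.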

      Marko: IMO that one should be replaced with something like SIGINT, wait a bit, SIGABRT (to get proof of a server shutdown hang), wait a bit more, then SIGKILL.
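The escalation Marko suggests could look roughly like the sketch below. It is not part of safe_process.cc: the names `kill_child_escalating` and `wait_for_exit`, the `grace_ms` parameter, and the 20 ms polling step are all hypothetical, and the sketch assumes the child is the leader of its own process group, as the existing `kill(-child_pid, ...)` call implies.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <ctime>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum wait_result { CHILD_REAPED, CHILD_TIMEOUT, CHILD_ERROR };

// Poll waitpid() without blocking until the child exits or timeout_ms elapses.
static wait_result wait_for_exit(pid_t child_pid, int timeout_ms, int *status)
{
  const int step_ms= 20;                  // hypothetical polling interval
  for (int waited= 0; ; waited+= step_ms)
  {
    pid_t ret= waitpid(child_pid, status, WNOHANG);
    if (ret == child_pid)
      return CHILD_REAPED;
    if (ret == -1 && errno != EINTR)
      return CHILD_ERROR;                 // e.g. ECHILD: not our child
    if (waited >= timeout_ms)
      return CHILD_TIMEOUT;
    struct timespec ts;
    ts.tv_sec= 0;
    ts.tv_nsec= step_ms * 1000000L;
    nanosleep(&ts, nullptr);
  }
}

// Signal the child's process group with SIGINT, then SIGABRT, then SIGKILL,
// allowing grace_ms after each of the first two signals.
// Returns the last signal sent before the child was reaped, or -1 on error.
static int kill_child_escalating(pid_t child_pid, int grace_ms, int *status)
{
  assert(child_pid > 1);                  // never signal group 0 or "everyone"
  const int signals[]= { SIGINT, SIGABRT, SIGKILL };
  for (int sig : signals)
  {
    kill(-child_pid, sig);                // whole process group, as in the patch
    if (sig == SIGKILL)                   // cannot be caught or ignored,
    {                                     // so block until the child is reaped
      pid_t ret;
      do
        ret= waitpid(child_pid, status, 0);
      while (ret == -1 && errno == EINTR);
      return ret == child_pid ? sig : -1;
    }
    wait_result res= wait_for_exit(child_pid, grace_ms, status);
    if (res == CHILD_REAPED)
      return sig;
    if (res == CHILD_ERROR)
      return -1;
  }
  return -1;                              // not reached
}
```

A caller would use something like `kill_child_escalating(child_pid, 5000, &status)` in place of the single `kill()` plus blocking `waitpid()`. A return value of SIGABRT or SIGKILL would indicate that the gentler signal was not enough. Suitable grace periods are an open question; the 5000 ms here is only a placeholder.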

          Activity

            I think that sometimes this is not enough. But if we apply all of my patch below (which I used on 10.5 to make one test rr-friendly), then replication tests will start failing massively (apparently because they like to SIGKILL processes).

            diff --git a/client/mysqltest.cc b/client/mysqltest.cc
            index 417d3615995..48b8f132eb2 100644
            --- a/client/mysqltest.cc
            +++ b/client/mysqltest.cc
            @@ -5141,7 +5141,7 @@ void do_shutdown_server(struct st_command *command)
                 if (timeout)
                   (void) my_kill(pid, SIGABRT);
                 /* Give server a few seconds to die in all cases */
            -    if (!timeout || wait_until_dead(pid, timeout < 5 ? 5 : timeout))
            +    if (!timeout || wait_until_dead(pid, timeout < 60 ? 60 : timeout))
                 {
                   (void) my_kill(pid, SIGKILL);
                 }
            diff --git a/mysql-test/lib/My/SafeProcess/safe_process.cc b/mysql-test/lib/My/SafeProcess/safe_process.cc
            index 4d0d1e2a3a0..abc167a4300 100644
            --- a/mysql-test/lib/My/SafeProcess/safe_process.cc
            +++ b/mysql-test/lib/My/SafeProcess/safe_process.cc
            @@ -144,7 +144,7 @@ static int kill_child(bool was_killed)
               message("Killing child: %d", child_pid);
               // Terminate whole process group
               if (! was_killed)
            -    kill(-child_pid, SIGKILL);
            +    kill(-child_pid, SIGABRT);
             
               pid_t ret_pid= waitpid(child_pid, &status, 0);
               if (ret_pid == child_pid)
            diff --git a/mysql-test/lib/v1/mtr_process.pl b/mysql-test/lib/v1/mtr_process.pl
            index fd9f3817699..ee9a370c467 100644
            --- a/mysql-test/lib/v1/mtr_process.pl
            +++ b/mysql-test/lib/v1/mtr_process.pl
            @@ -456,8 +456,8 @@ sub mtr_kill_leftovers () {
                     my $retries= 10;                    # 10 seconds
                     do
                     {
            -          mtr_debug("Sending SIGKILL to pids: " . join(' ', @pids));
            -          kill(9, @pids);
            +          mtr_debug("Sending SIGABRT to pids: " . join(' ', @pids));
            +          kill(6, @pids);
                       mtr_report("Sleep 1 second waiting for processes to die");
                       sleep(1)                      # Wait one second
                     } while ( $retries-- and  kill(0, @pids) );
            

            Maybe a subset of this would be safe to apply?

            (Comment added by marko, Marko Mäkelä.)

            People

              midenok Aleksey Midenkov
              nikitamalyavin Nikita Malyavin