  1. MariaDB Server
  2. MDEV-23186

mysqld doesn't create core dump if crashing while backtracing or dumping memory

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Not a Bug
    • 10.4.13
    • N/A
    • Server
    • None

    Description

      When a second crash happens inside the crash-handling routine, the core dump function never gets called. As a result, no core file can be obtained and the crash can't be analyzed.

      Let's add a config file or command-line option to go straight to core dumping in case of a crash.

          Activity

            danblack Daniel Black added a comment -

            To skip the stack trace you can use --skip-stack-trace on command line or as config option.

            https://mariadb.com/kb/en/mysqld-options/#-stack-trace
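
            As a config option, that suggestion might look like the sketch below. Only skip-stack-trace comes from this comment; the core-file line is an assumption added here on the premise that a core file is wanted (to my understanding it is the option that sets TEST_CORE_ON_SIGNAL), and the [mysqld] group assumes a standard option file. OS-level core size limits are a separate matter and are not covered.

```ini
[mysqld]
# Do not print a stack trace from the fatal signal handler
skip-stack-trace
# Assumption: request a core file on fatal signals
core-file
```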


            rpizzi Rick Pizzi (Inactive) added a comment -

            danblack, will this also skip the memory dump?

            rpizzi Rick Pizzi (Inactive) added a comment - - edited

            Perusing the source code, I see that in the case of a double segfault the core dump is skipped by default, regardless of the value of that command-line option. This explains why we aren't getting a core file:

            extern "C" sig_handler handle_fatal_signal(int sig)
            {
              time_t curr_time;
              struct tm tm;
             
            #ifdef HAVE_STACKTRACE
              THD *thd;
              /*
                 This flag remembers if the query pointer was found invalid.
                 We will try and print the query at the end of the signal handler, in case
                 we're wrong.
              */
              bool print_invalid_query_pointer= false;
            #endif
             
              if (segfaulted)
              {
                my_safe_printf_stderr("Fatal " SIGNAL_FMT " while backtracing\n", sig);
                goto end;
              }
             
              segfaulted = 1;
             
            [  ... ]
             
            #ifdef HAVE_WRITE_CORE
              if (test_flags & TEST_CORE_ON_SIGNAL)
              {
                my_write_core(sig);
              }
            #endif
             
            end:
            #ifndef __WIN__
              /*
                 Quit, without running destructors (etc.)
                 Use a signal, because the parent (systemd) can check that with WIFSIGNALED
                 On Windows, do not terminate, but pass control to exception filter.
              */
              signal(sig, SIG_DFL);
              kill(getpid(), sig);
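
            To make the control flow above easier to follow, here is a toy model of it, not MariaDB code. It installs no real signal handler; it only records which steps the handler would reach. The names CrashHandlerModel, simulate, backtrace_faults and the step labels are hypothetical. It assumes the backtrace step is the only place where a second fault can occur and that --skip-stack-trace bypasses that step entirely.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: field names echo the excerpt, nothing here is MariaDB code.
struct CrashHandlerModel {
  bool core_on_signal;      // stands in for (test_flags & TEST_CORE_ON_SIGNAL)
  bool skip_stack_trace;    // stands in for --skip-stack-trace
  bool backtrace_faults;    // hypothetical: does printing the backtrace crash?
  bool segfaulted = false;  // re-entry guard, as in the excerpt
  std::vector<std::string> steps;  // steps the handler actually reached

  CrashHandlerModel(bool core, bool skip, bool faults)
      : core_on_signal(core), skip_stack_trace(skip), backtrace_faults(faults) {}

  void handle_fatal_signal() {
    if (segfaulted) {
      // Second fault: the excerpt jumps to `end`, bypassing my_write_core().
      steps.push_back("fatal-while-backtracing");
      steps.push_back("reraise");
      return;
    }
    segfaulted = true;

    if (!skip_stack_trace) {
      steps.push_back("backtrace");
      if (backtrace_faults) {
        // The nested fault re-enters the handler. Control never comes back
        // to the outer call, because the process is killed at `end`.
        handle_fatal_signal();
        return;
      }
    }

    if (core_on_signal) steps.push_back("write-core");
    steps.push_back("reraise");  // signal(sig, SIG_DFL); kill(getpid(), sig);
  }
};

// Run one simulated crash and report the steps taken.
std::vector<std::string> simulate(bool core_on_signal, bool skip_stack_trace,
                                  bool backtrace_faults) {
  CrashHandlerModel m(core_on_signal, skip_stack_trace, backtrace_faults);
  m.handle_fatal_signal();
  return m.steps;
}
```

            In this model, simulate(true, false, true) never records "write-core", which matches what we observe, while simulate(true, true, true) does reach it because the faulting step is skipped.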
            
            


            rpizzi Rick Pizzi (Inactive) added a comment -

            Following the code, it appears that the crash is indeed inside my_print_backtrace() and the actual memory dump is from the system library and not from MariaDB. We will try to run with the option you have mentioned. Thanks.

            danblack Daniel Black added a comment -

            You're welcome. It did look like your other MDEV crashed in my_print_backtrace. The option doesn't quite eliminate the entire signal handler; however, as you've seen, it's quite minimal with --skip-stack-trace enabled. Best wishes resolving the initial crash.


            marko Marko Mäkelä added a comment -

            I suspect that MDEV-14229 increased the probability of this happening, due to invoking more code in the signal handler.

            People

              Unassigned Unassigned
              rpizzi Rick Pizzi (Inactive)