Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-35094

MTR breaks plugin rr traces

    XMLWordPrintable

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.5
    • 10.5
    • Tests
    • None

    Description

      Reproduce

      Compile and run some plugin test with rr tracing:

      mtr --rr spider.vers_derived
      

      Result

      Broken backtrace for ha_spider.so:

      Thread 14 (Thread 3415313.3415527 (mariadbd)):
      #0  0x0000000070000002 in syscall_traced ()
      #1  0x00007503ea93ec9b in _raw_syscall () at /usr/src/rr-5.8.0-1/src/preload/raw_syscall.S:120
      #2  0x00007503ea9388bd in traced_raw_syscall (call=call@entry=0x7503c7bfffa0) at /usr/src/rr-5.8.0-1/src/preload/syscallbuf.c:377
      #3  0x00007503ea93a337 in sys_poll (call=call@entry=0x7503c7bfffa0) at /usr/src/rr-5.8.0-1/src/preload/syscallbuf.c:2660
      #4  0x00007503ea93ca64 in syscall_hook_internal (call=0x7503c7bfffa0) at /usr/src/rr-5.8.0-1/src/preload/syscallbuf.c:4218
      #5  syscall_hook (call=0x7503c7bfffa0) at /usr/src/rr-5.8.0-1/src/preload/syscallbuf.c:4355
      #6  syscall_hook (call=0x7503c7bfffa0) at /usr/src/rr-5.8.0-1/src/preload/syscallbuf.c:4339
      #7  0x00007503ea9382d3 in _syscall_hook_trampoline () at /usr/src/rr-5.8.0-1/src/preload/syscall_hook.S:308
      #8  0x00007503ea93833d in __morestack () at /usr/src/rr-5.8.0-1/src/preload/syscall_hook.S:443
      #9  0x00007503ea938359 in _syscall_hook_trampoline_48_3d_00_f0_ff_ff () at /usr/src/rr-5.8.0-1/src/preload/syscall_hook.S:462
      #10 0x00007503ea11a215 in __GI___poll (fds=0x7503e807f978, nfds=1, timeout=600000) at ../sysdeps/unix/sysv/linux/poll.c:29
      #11 0x00005e28cb88528e in vio_io_wait (vio=0x7503cc405028, event=VIO_IO_EVENT_READ, timeout=600000) at ../src/vio/viosocket.c:1000
      #12 0x00005e28cb884ff5 in vio_socket_io_wait (vio=0x7503cc405028, event=VIO_IO_EVENT_READ) at ../src/vio/viosocket.c:118
      #13 0x00005e28cb88564f in vio_read (vio=0x7503cc405028, buf=0x7503cc0788d8 "\a", size=16384) at ../src/vio/viosocket.c:199
      #14 0x00005e28cb885ba6 in vio_read_buff (vio=0x7503cc405028, buf=0x7503cc2a9b88 "\230", size=4) at ../src/vio/viosocket.c:247
      #15 0x00005e28cb3449fb in my_real_read (net=0x7503cc1fad78, complen=0x7503e807fbf0, header=0 '\000') at ../src/sql/net_serv.cc:996
      #16 0x00005e28cb34447e in my_net_read_packet_reallen (net=0x7503cc1fad78, read_from_server=0 '\000', reallen=0x7503e807fcb0) at ../src/sql/net_serv.cc:1277
      #17 0x00005e28cb3390b9 in cli_safe_read_reallen (mysql=0x7503cc1fad78, reallen=0x7503e807fcb0) at ../src/sql-common/client.c:385
      #18 0x00005e28cb339071 in cli_safe_read (mysql=0x7503cc1fad78) at ../src/sql-common/client.c:374
      #19 0x00005e28cb3425dc in cli_read_query_result (mysql=0x7503cc1fad78) at ../src/sql-common/client.c:3533
      #20 0x00005e28cb34025d in server_mysql_real_query (mysql=0x7503cc1fad78, query=0x7503cc393818 "select `x`,`row_start`,`row_end` from `test`.`t10` where ((`row_end` > '2024-10-07 17:35:46.398977') and (`row_start` <= '2024-10-07 17:35:46.398977'))", length=151) at ../src/sql-common/client.c:3627
      #21 0x00007503e0193648 in ?? ()
      #22 0x0000000100010000 in ?? ()
      #23 0x00005e28cbf3bb62 in _my_thread_var () at ../src/mysys/my_thr_init.c:389
      #24 0x00007503e00cac39 in ?? ()
      #25 0x00007503e8080170 in ?? ()
      #26 0x00005e28cbf3c4e0 in my_thread_var_dbug () at ../src/mysys/my_thr_init.c:439
      #27 0x00007503e01c42fc in ?? ()
      #28 0x00007503e80802a0 in ?? ()
      #29 0x00007503cc406bb8 in ?? ()
      #30 0x00007503cc406558 in ?? ()
      #31 0x00000000cb5ad320 in ?? ()
      #32 0x0000010000000001 in ?? ()
      #33 0x00007503cc001618 in ?? ()
      #34 0x00007503e0082234 in ?? ()
      #35 0x00007503e0084f65 in ?? ()
      #36 0x000005e38000000e in ?? ()
      #37 0x00007503cc0015b0 in ?? ()
      #38 0x0000000000000097 in ?? ()
      #39 0x0000000000000001 in ?? ()
      #40 0x00000000e8080320 in ?? ()
      #41 0x00007503cc2d3fb0 in ?? ()
      #42 0x0000000dcc18ad88 in ?? ()
      #43 0x00000258e80802e8 in ?? ()
      #44 0x00007503e0075476 in ?? ()
      #45 0x00007503e00819de in ?? ()
      #46 0x000031de8000000d in ?? ()
      #47 0x00007503e8080438 in ?? ()
      #48 0x00000097e0075476 in ?? ()
      #49 0x00007503cc2d4000 in ?? ()
      #50 0x00007503cc3d6a98 in ?? ()
      #51 0x00000003e8080438 in ?? ()
      #52 0x00007503cc406bb8 in ?? ()
      #53 0x0000000000000001 in ?? ()
      #54 0x00007503cc2d3fb0 in ?? ()
      #55 0x00007503cc406bb8 in ?? ()
      #56 0x00007503e8080370 in ?? ()
      #57 0x00007503e0164fb2 in ?? ()
      #58 0x00007503cc2d3fb0 in ?? ()
      #59 0x00007503cc406bb8 in ?? ()
      #60 0x0000000000000001 in ?? ()
      #61 0x00007503e8080464 in ?? ()
      #62 0x0000000000000000 in ?? ()
      
      

      Notes

      Running `mtr --manual-gdb` works just fine and produces good traces. Any manipulations with SafeProcess::start_kill() (replacing SIGTERM with SIGINT, adding sleep) didn't help. start_kill() is used by stop_servers() which is run from main::run_worker().

      Logging with:

      export RR_LOG=all:debug
      

      produces following differences:

      bad.log

      [record_signal] 3459971: handling signal SIGSTKFLT (pevent: PTRACE_EVENT(0), event: (none)
      [record_signal] Not safe to deliver signal at 0x73de3154345a
      [RecordTask] Set signal mask to block all signals (bar SYSCALLBUF_DESCHED_SIGNAL/TIME_SLICE_SIGNAL) while we  have a stashed signal
      [record_signal] Reached final syscallbuf instruction, singlestepping to enable signal dispatch
      [record_signal] 3459971: handling signal SIGSTKFLT (pevent: PTRACE_EVENT(0), event: (none)
      [record_signal] Safe to deliver signal at 0x73de30d2a882 because not in syscallbuf
      [record_signal] 3459971: handling signal SIGPWR (pevent: PTRACE_EVENT(0), event: (none)
      [record_signal]   dropping ignored SIGPWR
      [record_signal]   (at disarm-desched, so finished buffered syscall; resuming)
      ...
      [record_signal] 3459971: handling signal SIGSTKFLT (pevent: PTRACE_EVENT(0), event: (none)
      [record_signal] Not safe to deliver signal at 0x73de3154345a
      [RecordTask] Set signal mask to block all signals (bar SYSCALLBUF_DESCHED_SIGNAL/TIME_SLICE_SIGNAL) while we  have a stashed signal
      [record_signal] Reached final syscallbuf instruction, singlestepping to enable signal dispatch
      [record_signal] 3459971: handling signal SIGSTKFLT (pevent: PTRACE_EVENT(0), event: (none)
      [record_signal] Safe to deliver signal at 0x73de30d2a882 because not in syscallbuf
      [record_signal] 3459971: handling signal SIGPWR (pevent: PTRACE_EVENT(0), event: (none)
      [record_signal]   dropping ignored SIGPWR
      [record_signal]   (at disarm-desched, so finished buffered syscall; resuming)
      2024-10-07 19:54:43 0 [Note] Got signal 15 to shutdown mysqld
      [record_signal] 3459739: handling signal SIGUSR1 (pevent: PTRACE_EVENT(0), event: SYSCALL_INTERRUPTION: poll
      

      good.log

      [record_signal] 3462104: handling signal SIGSTKFLT (pevent: PTRACE_EVENT(0), event: (none)
      [record_signal] Safe to deliver signal at 0x601027a14905 because not in syscallbuf
      

      Full logs attached.

      Attachments

        1. rr-bad.log.xz
          139 kB
          Aleksey Midenkov
        2. rr-good.log.xz
          144 kB
          Aleksey Midenkov
        3. signal-bad.log
          42 kB
          Aleksey Midenkov
        4. signal-good.log
          40 kB
          Aleksey Midenkov

        Activity

          People

            midenok Aleksey Midenkov
            midenok Aleksey Midenkov
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.