  1. MariaDB Server
  2. MDEV-36016

10.11.9 unused replica - server hung

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 10.11.9
    • Fix Version/s: N/A
    • Component/s: Server
    • Labels: None
    • Environment: debian bookworm

    Description

      We are testing MariaDB 10.11.9 on a replica that does nothing apart from replicate. Today the replica started to lag behind, and when we tried to run anything on it, the server was simply unresponsive.
      Nothing is shown in the logs, but issuing STOP SLAVE; has no effect: the server does not act on it and the statement just waits.

      Some traces

      strace: Process 1353 attached
      restart_syscall(<... resuming interrupted read ...>) = 1
      accept4(91, {sa_family=AF_UNIX}, [128 => 2], SOCK_CLOEXEC) = 112
      rt_sigprocmask(SIG_BLOCK, ~[], [HUP INT QUIT PIPE ALRM TERM TSTP], 8) = 0
      clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fdad4d41990, parent_tid=0x7fdad4d41990, exit_signal=0, stack=0x7fdad4d10000, stack_size=0x30e80, tls=0x7fdad4d416c0} => {parent_tid=[1621901]}, 88) = 1621901
      rt_sigprocmask(SIG_SETMASK, [HUP INT QUIT PIPE ALRM TERM TSTP], NULL, 8) = 0
      futex(0x55f951a2dd68, FUTEX_WAKE_PRIVATE, 1) = 1
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
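      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      poll([{fd=87, events=POLLIN}, {fd=88, events=POLLIN}, {fd=89, events=POLLIN}, {fd=90, events=POLLIN}, {fd=91, events=POLLIN}], 5, -1) = 1 ([{fd=91, revents=POLLIN}])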
      accept4(91, {sa_family=AF_UNIX}, [128 => 2], SOCK_CLOEXEC) = 112
      rt_sigprocmask(SIG_BLOCK, ~[], [HUP INT QUIT PIPE ALRM TERM TSTP], 8) = 0
      clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fdad4e9f990, parent_tid=0x7fdad4e9f990, exit_signal=0, stack=0x7fdad4e6e000, stack_size=0x30e80, tls=0x7fdad4e9f6c0} => {parent_tid=[1622673]}, 88) = 1622673
      rt_sigprocmask(SIG_SETMASK, [HUP INT QUIT PIPE ALRM TERM TSTP], NULL, 8) = 0
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      poll([{fd=87, events=POLLIN}, {fd=88, events=POLLIN}, {fd=89, events=POLLIN}, {fd=90, events=POLLIN}, {fd=91, events=POLLIN}], 5, -1) = 1 ([{fd=91, revents=POLLIN}])
      accept4(91, {sa_family=AF_UNIX}, [128 => 2], SOCK_CLOEXEC) = 112
      rt_sigprocmask(SIG_BLOCK, ~[], [HUP INT QUIT PIPE ALRM TERM TSTP], 8) = 0
      clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fdad4d73990, parent_tid=0x7fdad4d73990, exit_signal=0, stack=0x7fdad4d42000, stack_size=0x30e80, tls=0x7fdad4d736c0} => {parent_tid=[1622730]}, 88) = 1622730
      rt_sigprocmask(SIG_SETMASK, [HUP INT QUIT PIPE ALRM TERM TSTP], NULL, 8) = 0
      futex(0x55f951a2e628, FUTEX_WAKE_PRIVATE, 1) = 1
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      poll([{fd=87, events=POLLIN}, {fd=88, events=POLLIN}, {fd=89, events=POLLIN}, {fd=90, events=POLLIN}, {fd=91, events=POLLIN}], 5, -1) = 1 ([{fd=91, revents=POLLIN}])
      accept4(91, {sa_family=AF_UNIX}, [128 => 2], SOCK_CLOEXEC) = 112
      rt_sigprocmask(SIG_BLOCK, ~[], [HUP INT QUIT PIPE ALRM TERM TSTP], 8) = 0
      clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fdad4d0f990, parent_tid=0x7fdad4d0f990, exit_signal=0, stack=0x7fdad4cde000, stack_size=0x30e80, tls=0x7fdad4d0f6c0} => {parent_tid=[1622738]}, 88) = 1622738
      rt_sigprocmask(SIG_SETMASK, [HUP INT QUIT PIPE ALRM TERM TSTP], NULL, 8) = 0
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      poll([{fd=87, events=POLLIN}, {fd=88, events=POLLIN}, {fd=89, events=POLLIN}, {fd=90, events=POLLIN}, {fd=91, events=POLLIN}], 5, -1) = 1 ([{fd=91, revents=POLLIN}])
      accept4(91, {sa_family=AF_UNIX}, [128 => 2], SOCK_CLOEXEC) = 112
      rt_sigprocmask(SIG_BLOCK, ~[], [HUP INT QUIT PIPE ALRM TERM TSTP], 8) = 0
      clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fdad4f03990, parent_tid=0x7fdad4f03990, exit_signal=0, stack=0x7fdad4ed2000, stack_size=0x30e80, tls=0x7fdad4f036c0} => {parent_tid=[1622740]}, 88) = 1622740
      rt_sigprocmask(SIG_SETMASK, [HUP INT QUIT PIPE ALRM TERM TSTP], NULL, 8) = 0
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      poll([{fd=87, events=POLLIN}, {fd=88, events=POLLIN}, {fd=89, events=POLLIN}, {fd=90, events=POLLIN}, {fd=91, events=POLLIN}], 5, -1) = 1 ([{fd=91, revents=POLLIN}])
      accept4(91, {sa_family=AF_UNIX}, [128 => 2], SOCK_CLOEXEC) = 112
      rt_sigprocmask(SIG_BLOCK, ~[], [HUP INT QUIT PIPE ALRM TERM TSTP], 8) = 0
      clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fdad4d41990, parent_tid=0x7fdad4d41990, exit_signal=0, stack=0x7fdad4d10000, stack_size=0x30e80, tls=0x7fdad4d416c0} => {parent_tid=[1622763]}, 88) = 1622763
      rt_sigprocmask(SIG_SETMASK, [HUP INT QUIT PIPE ALRM TERM TSTP], NULL, 8) = 0
      futex(0x55f951a2e9e8, FUTEX_WAKE_PRIVATE, 1) = 1
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      poll([{fd=87, events=POLLIN}, {fd=88, events=POLLIN}, {fd=89, events=POLLIN}, {fd=90, events=POLLIN}, {fd=91, events=POLLIN}], 5, -1) = 1 ([{fd=91, revents=POLLIN}])
      accept4(91, {sa_family=AF_UNIX}, [128 => 2], SOCK_CLOEXEC) = 112
      rt_sigprocmask(SIG_BLOCK, ~[], [HUP INT QUIT PIPE ALRM TERM TSTP], 8) = 0
      clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fdad4e09990, parent_tid=0x7fdad4e09990, exit_signal=0, stack=0x7fdad4dd8000, stack_size=0x30e80, tls=0x7fdad4e096c0} => {parent_tid=[1622769]}, 88) = 1622769
      rt_sigprocmask(SIG_SETMASK, [HUP INT QUIT PIPE ALRM TERM TSTP], NULL, 8) = 0
      futex(0x55f951a2eb28, FUTEX_WAKE_PRIVATE, 1) = 1
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      accept4(91, 0x7ffc43c1f060, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
      poll([{fd=87, events=POLLIN}, {fd=88, events=POLLIN}, {fd=89, events=POLLIN}, {fd=90, events=POLLIN}, {fd=91, events=POLLIN}], 5, -1
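
The trace above is highly repetitive: the traced thread accepts a connection on fd 91, spawns a thread with clone3, retries accept4 until it returns EAGAIN, and then goes back to poll. A tally per syscall and outcome can make that pattern easier to see at a glance. The following is only a minimal Python sketch; the function name `summarize_strace` and the regular expression are illustrative assumptions and not part of the original report, and the tally by itself does not explain the hang.

```python
import re
from collections import Counter

# Matches completed calls such as:
#   accept4(91, ..., SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
#   futex(0x55f951a2dd68, FUTEX_WAKE_PRIVATE, 1) = 1
SYSCALL_RE = re.compile(
    r"^\s*(?P<name>\w+)\(.*\)\s+=\s+(?P<ret>-?\d+)(?:\s+(?P<errno>E[A-Z]+))?"
)


def summarize_strace(text: str) -> Counter:
    """Count (syscall, outcome) pairs; outcome is an errno name or "ok"."""
    counts = Counter()
    for line in text.splitlines():
        m = SYSCALL_RE.match(line)
        if m is None:
            # Skips strace banner lines and calls that have not returned yet,
            # such as the final unterminated poll(...) in the trace.
            continue
        outcome = m.group("errno") or "ok"
        counts[(m.group("name"), outcome)] += 1
    return counts
```

Saving the strace output to a file and passing its contents to `summarize_strace` would return a `Counter` whose `most_common()` lists the dominant calls first.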
      

      I've also attached the output of

      sudo gdb --batch --eval-command="set print frame-arguments all"  --eval-command="thread apply all bt full" /usr/sbin/mariadbd $(pgrep -xn mariadbd)  > mariadbd_full_bt_all_threads.txt
      

          People

            Assignee: Unassigned
            Reporter: Manuel Arostegui (marostegui)