  1. MariaDB Server
  2. MDEV-30515

STOP SLAVE causes the master server to crash with semi-sync replication

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • 10.5, 10.6
    • Replication, Server
    • None
    • CentOS 7

    Description

      A customer reported that running STOP SLAVE caused the master to crash. While reproducing the issue, I saw the master crash as soon as the slave was stopped manually.
      The following error log excerpts from the master and the slave were captured when the slave was stopped manually.
      For testing, sysbench was run against the backend server through 2 MaxScale servers.

      • 200 threads with 90:10 (write/read workload) - port : 2582
      • 200 threads with 1000 (read only workload) - port : 2482
      • on master

        2023-01-31  8:55:59 4481 [ERROR] Semi-sync master failed on net_flush() before waiting for slave reply
        2023-01-31  8:55:59 4481 [Note] Stop semi-sync binlog_dump to slave (server_id: 2)
        2023-01-31  8:56:00 4481 [Warning] Aborted connection 4481 to db: 'unconnected' user: 'repl' host: '192.168.254.53' (Failed to run hook 'after_send_event')
        2023-01-31  8:56:09 4204 [Warning] Timeout waiting for reply of binlog (file: mariadb-bin.000001, pos: 214503559), semi-sync up to file mariadb-bin.000001, position 214503085.
        2023-01-31  8:56:09 4204 [Note] Semi-sync replication switched OFF.
        2023-01-31  8:56:09 4901 [Warning] Aborted connection 4901 to db: 'unconnected' user: 'unauthenticated' host: '192.168.254.53' (Got an error reading communication packets)
        2023-01-31  8:56:09 4901 [Warning] Aborted connection 4901 to db: 'unconnected' user: 'unauthenticated' host: '192.168.254.53' (This connection closed normally without authentication)
        2023-01-31  8:56:09 4346 [Warning] Aborted connection 4346 to db: 'unconnected' user: 'mxs' host: '192.168.254.54' (Got an error reading communication packets)
        2023-01-31  8:56:09 4903 [Warning] Aborted connection 4903 to db: 'unconnected' user: 'unauthenticated' host: '192.168.254.55' (Got an error reading communication packets)
        2023-01-31  8:56:09 4903 [Warning] Aborted connection 4903 to db: 'unconnected' user: 'unauthenticated' host: '192.168.254.55' (This connection closed normally without authentication)
        2023-01-31  8:56:09 4904 [Warning] Aborted connection 4904 to db: 'unconnected' user: 'unauthenticated' host: '192.168.254.54' (Got an error reading communication packets)
        2023-01-31  8:56:09 4904 [Warning] Aborted connection 4904 to db: 'unconnected' user: 'unauthenticated' host: '192.168.254.54' (This connection closed normally without authentication)
        2023-01-31  8:56:09 4347 [Warning] Aborted connection 4347 to db: 'unconnected' user: 'mxs' host: '192.168.254.55' (Got an error reading communication packets)
        230131  8:56:10 [ERROR] mysqld got signal 11 ;
        This could be because you hit a bug. It is also possible that this binary
        or one of the libraries it was linked against is corrupt, improperly built,
        or misconfigured. This error can also be caused by malfunctioning hardware.
         
        To report this bug, see https://mariadb.com/kb/en/reporting-bugs
         
        We will try our best to scrape up some info that will hopefully help
        diagnose the problem, but since we have already crashed,
        something is definitely wrong and this may fail.
         
        Server version: 10.5.16-11-MariaDB-enterprise-log
        key_buffer_size=33554432
        read_buffer_size=131072
        max_used_connections=408
        max_threads=65537
        thread_count=301
        It is possible that mysqld could use up to
        key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 143409586 K  bytes of memory
        Hope that's ok; if not, decrease some variables in the equation.
         
        Thread pointer: 0x7f4388437248
        Attempting backtrace. You can use the following information to find out
        where mysqld died. If you see no messages after this, something went
        terribly wrong...
        2023-01-31  8:56:15 0 [Note] mariadbd: Aria engine: starting recovery
        recovered pages: 0% 25% 59% 92% 100% (0.0 seconds); tables to flush: 1 0
         (0.0 seconds);
        2023-01-31  8:56:15 0 [Note] mariadbd: Aria engine: recovery done
        

      • on slave

        2023-01-31  8:55:59 6387 [Note] Error reading relay log event: slave SQL thread was killed
        2023-01-31  8:55:59 6387 [Note] Slave SQL thread exiting, replication stopped in log 'mariadb-bin.000001' at position 214503085; GTID position '0-1-370185'
        2023-01-31  8:55:59 6387 [Note] master was 192.168.254.52:3306
        2023-01-31  8:55:59 6386 [Note] Slave I/O thread exiting, read up to log 'mariadb-bin.000001', position 214503085; GTID position 0-1-370185
        2023-01-31  8:55:59 6386 [Note] master was 192.168.254.52:3306
        2023-01-31  8:56:01 6386 [Note] cannot connect to master to kill slave io_thread's connection
        2023-01-31  9:01:12 6340 [Warning] Aborted connection 6340 to db: 'unconnected' user: 'root' host: 'localhost' (Got timeout reading communication packets)
        

      This is reproducible on the latest 10.5.x and 10.6.x versions.
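
      When comparing repro runs, it can help to pull the key timestamps out of the master error log automatically. The following is a minimal Python sketch, not part of the original report, that parses the two timestamp styles seen in the master log above (the `2023-01-31  8:56:09 4204 [Note] ...` form and the `230131  8:56:10 [ERROR] ...` form used by the crash handler) and measures the gap between two messages. The function names `parse_line` and `seconds_between` are hypothetical helpers, and the regular expressions only assume the line layouts visible in this ticket.

```python
import re
from datetime import datetime

# Layout: "2023-01-31  8:56:09 4204 [Note] message" (date, time, thread id, level)
NEW_STYLE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2})\s+(\d{1,2}:\d{2}:\d{2})\s+(\d+)\s+\[(\w+)\]\s+(.*)$"
)
# Layout: "230131  8:56:10 [ERROR] message" (YYMMDD, time, level, no thread id)
OLD_STYLE = re.compile(
    r"^\s*(\d{6})\s+(\d{1,2}:\d{2}:\d{2})\s+\[(\w+)\]\s+(.*)$"
)


def parse_line(line):
    """Return (timestamp, thread_id_or_None, level, message), or None
    for lines that carry no timestamp (for example the crash banner text)."""
    m = NEW_STYLE.match(line)
    if m:
        date, clock, thread, level, msg = m.groups()
        # Hours are not zero-padded in the log, so pad before parsing.
        ts = datetime.strptime(f"{date} {clock.zfill(8)}", "%Y-%m-%d %H:%M:%S")
        return ts, int(thread), level, msg
    m = OLD_STYLE.match(line)
    if m:
        date, clock, level, msg = m.groups()
        ts = datetime.strptime(f"{date} {clock.zfill(8)}", "%y%m%d %H:%M:%S")
        return ts, None, level, msg
    return None


def seconds_between(lines, first_marker, second_marker):
    """Seconds from the first line containing first_marker to the next
    later line containing second_marker; None if either is missing."""
    first = second = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, _, _, msg = parsed
        if first is None and first_marker in msg:
            first = ts
        elif first is not None and second is None and second_marker in msg:
            second = ts
    if first is None or second is None:
        return None
    return (second - first).total_seconds()
```

      Applied to the master log above, `seconds_between(lines, "Semi-sync replication switched OFF", "got signal 11")` gives 1.0, and measuring from the `net_flush()` error instead gives 11.0, which matches the timestamps 8:55:59, 8:56:09 and 8:56:10 in the excerpt.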

      Attachments

        1. master.7z.001
          10.00 MB
        2. master.7z.002
          3.68 MB
        3. maxscale.cnf
          1 kB
        4. server.cnf
          3 kB
        5. slave.zip
          313 kB

          People

            sanja Oleksandr Byelkin
            allen.lee@mariadb.com Allen Lee (Inactive)