MariaDB Server / MDEV-9861

Network outage can break replication

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 10.1.11
    • Fix Version/s: 10.1(EOL)
    • Component/s: Replication
    • Labels: None

    Description

      We use parallel row-based replication for a few channels (20 ¯_(ツ)_/¯).

      Almost every time we reboot a master server (MySQL 5.6 @ RDS) or have a network outage, we get HA_ERR_FOUND_DUPP_KEY errors right after replication resumes (or "a foreign key constraint fails" errors with statement-based replication). Skipping those errors produces more errors, and idempotent mode leads to other errors.
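
      For illustration, the two workarounds mentioned above could be expressed as server options roughly like this. This is only a sketch, not a copy of our settings: the error codes are an assumption (1062 is the duplicate-key error; 1451 and 1452 are the "a foreign key constraint fails" errors), and the same effect can also be reached at runtime instead of through the option file. As described above, neither approach made replication recover cleanly.

      ```ini
      [mysqld]
      # Assumed error list: skip duplicate-key (1062) and
      # foreign-key constraint (1451, 1452) errors on the slave.
      slave_skip_errors = 1062,1451,1452
      # Idempotent mode: suppress duplicate-key and key-not-found
      # errors when applying row-based events.
      slave_exec_mode = IDEMPOTENT
      ```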

      Since a network outage is an expected kind of event, I'd consider this behaviour a bug.

      Our configuration:

      [mysqld]
      basedir = /nix/store/ja84bcvggh2aribmvjqj0rr89d066dkw-mariadb-10.1.11
      init_file = /nix/store/h7qfaw1m15w5sr69xqvqm0468ag7dvd6-init
      pid_file = /run/mysqld/mysqld.pid
      plugin_load = unix_socket=auth_socket.so
      datadir = /mariadb/db
      event_scheduler = ON
      group_concat_max_len = 8388608
      ignore_db_dirs = lost+found
      innodb_buffer_pool_dump_at_shutdown = ON
      innodb_buffer_pool_instances = 64
      innodb_buffer_pool_load_at_startup = ON
      innodb_buffer_pool_size = 19327352832
      innodb_file_format = barracuda
      innodb_file_per_table = ON
      innodb_flush_log_at_trx_commit = 2
      innodb_flush_method = O_DIRECT
      innodb_lock_wait_timeout = 1800
      innodb_log_file_size = 314572800
      join_buffer_size = 1048576
      log_slave_updates = OFF
      max_allowed_packet = 268435456
      max_connections = 1000
      net_read_timeout = 1000
      net_write_timeout = 1000
      port = 3306
      query_cache_limit = 2097152
      query_cache_size = 33554432
      query_cache_strip_comments = ON
      query_cache_type = 1
      relay_log = /mariadb/relay/cat-bin
      server_id = 123456
      skip_log_bin
      slave_compressed_protocol = ON
      slave_domain_parallel_threads = 10
      slave_net_timeout = 600
      slave_parallel_max_queued = 8388608
      slave_parallel_threads = 40
      sort_buffer_size = 4194304
      ssl_cert = /nix/store/rmlq71y8fnyx1k7sp3qx55fqwg45pl40-cats-cert.pem
      ssl_key = /run/keys/cats-key.pem
      table_open_cache = 30000
      # replication options:
      !include /nix/store/z3pldfvs7wpd8bhz6pbnr73zqagnxmx1-mysqld-repl.cnf
      
      

          People

            Assignee: Andrei Elkin (Elkin)
            Reporter: Igor Pashev (ip1981)